Dagstuhl EAS — Engineering Academic Software

http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=16252

Monday

Organization

  • JV
    • Motivation
      • Software needed everywhere
    • Goals
      • How to engineer AS of quality?
      • Roadmap
        • Recognition
          • How to get software recognized as an academic contribution
        • Best practices
        • Balance promises with skepticism
    • Output
      • Manifesto
        • Call to arms!
      • How to organize prizes
      • How to write proposals, letters of recommendation
      • Rules, best practices
      • Mythbusters
      • SWOT Analysis
  • JH
    • Drafts of things to share?
      • E.g., resources, version control

Intros

  • Claude Kirchner
    • Design & implementation of robust, reliable systems
      • Using Clu, Lisp, C ...
    • Scientific director INRIA 2010-2014
  • Rob van Nieuwpoort
    • Cloud computing, compilers, ...
    • Making software part of the infrastructure
    • How to measure the impact of software?
      • Papers
      • Producing other software
  • Mike Croucher
    • Understanding importance of software
    • Emergency repair
      • Getting software in shape for the masses
      • How to engineer organizations to value software
  • Carole Goble
    • Infrastructure for life sciences
      • Multi-institution, multi community software development
    • Want software to get credit!
      • How to get reputation
  • Alice Allen
    • Editor of astrophysics software (?)
      • Register all available software
      • Support reproducibility
      • Curate the repo
    • Want to help contributors get support
  • Katy Huff
    • Physics software (?)
    • Produces a lot of code
    • Software Carpentry ...
  • Oscar Nierstrasz
    • Software Evolution
    • How to get industry involved in academic software?
  • Cecilia Aragon
    • Visual analytics
    • Human centered data science
    • How to get software development effort recognized?
      • Contribute to career path
  • Christoph Becker
    • Management of digital resources for future use
      • Software sustainability
    • Software quality and curation
      • Long-term effects often not considered when software is built
  • Kevin Crowston
    • How technologies change the way people work
      • E.g., open source development
      • Citizen science
    • How to build sustainable communities around open source software?
  • Andrei Chis
    • Improving how developers create software
      • Moldable development tools
      • Domain aware
      • Change tools to become moldable and extensible
      • Software and applications are essential to research
        • Many community challenges!
        • Make sure work gets recognized and rewarded
    • How to get academic tools adopted in industry
      • Having users is good and bad!
      • JV: old Chinese curse — "I wish you many users"
  • Daniel Garijo
    • Scientific workflow
  • Benoit Combemale
    • Domain specific languages
      • Model driven development ...
    • Develop lots of software for real world case studies
      • Software to assess software impact
  • James Howison
    • Focus on collaboration
      • Development of software in science
  • Dan Katz
    • Computational scientist
    • Looking at how scientific software is developed
  • Matt Vaughn
    • Cyber infrastructure
    • Democratizing access
      • Creating environments for distributed, heterogeneous development
      • Industry standards, ...
      • Sharing digital objects
    • How to engage participation?
  • Matt Turk
    • Ensuring (astronomy etc) software is available to community at large
  • Katy Kuksenok
    • Cognitive resources
    • How are these resources shared and managed?
  • Jeff Carver
    • SE for Science
      • Empirical SE
      • Focus on human side
      • How do people develop software?
  • Ralf Laemmel
    • Software chrestomathies
    • Megamodels
      • Models about models
      • Documenting software development
  • Rob Haines
    • Software sustainability
    • Support researchers in developing academic software
      • Complete gamut of researchers and skills
  • Caroline Jay
    • Human computer interaction
      • Needs lots of software, even to gather data
      • Much of the software is simply not available
  • Jurgen Vinju
    • I'm a programmer that somebody made into a professor.

Initiatives

  • Dan Katz — Sustainable Software for Science
    • WSSSPE
      http://wssspe.researchcomputing.org.uk/
      • Three workshops so far
      • Reports online
    • Hot topics
      • White paper/journal paper about best practices in developing sustainable software
      • Funding Research Programmer Expertise
      • Software citation — Software Credit Working Group
  • Mike Croucher — Supporting Research Software Engineering
    • Emergency services for software
      • Faster code
        • E.g., avoiding expensive operations in loops
        • Using vectors instead of loops
        • First use profilers and then look for anti patterns
        • Often order of magnitude speedup
      • Migrating to clusters
      • GPU Computing
    • Users "don't care" about clean engineering
    • How do you know users can still work with the code?
      • Small steps, back and forth
      • CA: we get data scientists to sit with users several times a week
      • Users are afraid I'm going to "do computer science" to them.
    • Problem
      • Software is not valued in academia
    • CA: We have to stop calling software "infrastructure".
    • The future
      • Core funded software support staff
      • Faculty tenure based on software output
      • The first RSE professor?
    • Please help
      • First contact
      • Demonstrating impact
      • New technology
      • Good practice
      • Changing culture
    • CK: INRIA has ~50 tenured engineers
      • Can be assigned to a project for shorter or longer periods
  • Christoph Becker
    • How to design for sustainability?
      http://sustainabilitydesign.org
      • Sustainability of what?
    • Sustainability debt
      • Effects of software systems
        • Immediate effect
        • Enabling effect
        • Structural effects
      • Five dimensions
        • Economic
        • Technical
        • Social
        • Environmental
          • Energy efficiency
        • Individual
      • Need examples of each
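
The "faster code" advice in Mike Croucher's notes above (profile first, then look for anti-patterns such as expensive operations inside loops) can be illustrated with a minimal sketch. The functions `normalize_slow`, `normalize_fast`, and `profile` are hypothetical names invented for this illustration, not code shown at the seminar, and only the Python standard library is assumed.

```python
import cProfile
import io
import math
import pstats


def normalize_slow(values):
    # Anti-pattern: the norm never changes inside the loop,
    # yet it is recomputed for every element (quadratic work).
    result = []
    for v in values:
        norm = math.sqrt(sum(x * x for x in values))
        result.append(v / norm)
    return result


def normalize_fast(values):
    # Fix: hoist the loop-invariant norm out of the loop (linear work).
    norm = math.sqrt(sum(x * x for x in values))
    return [v / norm for v in values]


def profile(func, *args):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


data = [float(i) for i in range(1, 501)]
slow_result, slow_report = profile(normalize_slow, data)
fast_result, fast_report = profile(normalize_fast, data)
```

Printing `slow_report` and `fast_report` shows where the time goes in each version. The "vectors instead of loops" item would go one step further, for example by replacing the explicit loop with an array expression in a library such as NumPy; that step is left out here to keep the sketch dependency-free.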

Possible breakout topic

  • Recognizing software as a primary research output
    • Recording it
    • Assessing it
    • Measuring it
    • Rewarding people for doing it

Breakout: Empirical study of software in conferences

  • Participants: Jeffrey Carver, James Howison, Robert Haines, Caroline Jay, Kevin Crowston, Oscar Nierstrasz
  • Venue-specific empirical survey of academic software
    • Select key/peak conferences
    • Goal: provide data set that can be used to answer a variety of questions
      • What are software practices in this domain?
      • What software is cited? How is it cited?
  • Research questions
    • Which software ends up being mentioned in papers?
    • What's the status of the developers?
      • PhD students?
      • Engineers?
    • Where is the software now?
    • What practices are used?
      • E.g., version control, testing etc.
    • How was the technology chosen?
    • Who paid for the software?
      • What is the return on investment?
    • What problematic issues commonly arise?
      • What is difficult?
        • Pain points?
      • What recommendations would improve the quality of software in the fields?
        • Technology
        • Practices
        • ...
  • Procedural questions
    • How to achieve variance?
      • What do you code for?
        • Research questions?
        • Venues?
      • Grounded approach?
        • Or predefined hypotheses?
    • Would machine learning help to classify software in papers?
    • How to start?
      • Need a conference/domain with requirement for reproducibility
      • Interviews first to generate hypotheses
      • ...
    • How to structure?
      • Start with just 3 or 4 research questions
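
One way to picture the data set this breakout has in mind is a single record per software mention, with one field per research question listed above. The sketch below is only an assumption about how such coding could be structured: the class `SoftwareMention`, its field names, and the helper functions are hypothetical, and the sample values in the usage note are invented placeholders, not survey results.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SoftwareMention:
    """One piece of software mentioned in one paper (hypothetical schema)."""
    venue: str                               # conference the paper appeared in
    paper_id: str                            # any stable paper identifier
    software_name: str
    how_cited: str                           # e.g. "formal citation", "name only"
    developer_status: Optional[str] = None   # e.g. "PhD student", "engineer"
    current_location: Optional[str] = None   # where the software is now, if known
    practices: List[str] = field(default_factory=list)  # e.g. "version control"
    funder: Optional[str] = None             # who paid for the software


def citation_styles(mentions):
    """Count how often each citation style occurs."""
    return Counter(m.how_cited for m in mentions)


def share_using(mentions, practice):
    """Fraction of mentions whose software is coded with the given practice."""
    if not mentions:
        return 0.0
    return sum(practice in m.practices for m in mentions) / len(mentions)
```

For example, with three placeholder records such as `SoftwareMention("ExampleConf", "paper-1", "tool-a", "formal citation", practices=["version control"])`, `citation_styles` tallies the citation styles and `share_using(mentions, "version control")` gives the fraction coded with that practice. Starting with three or four research questions, as suggested above, would simply mean filling in fewer of the optional fields.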

Breakout reports

https://goo.gl/4HZg4K

  • Academic software project typology
    • What is an academic sw project?
      • No consensus
    • Dimensions
      • Intentions to write software
        • For theory building and validation
        • As part of empirical research method
        • For SE itself
          • The sw is the output
        • Fix your own problem
          • Automate tasks
        • Demonstrators
        • Hobby projects
          • Exploratory programming
        • Teaching
        • Benchmarking
    • More dimensions
      • Characterizing audience
      • Maturity level
  • Examining sustainability for a particular project
    • Spider graphic for various dimensions
    • Sustaining software vs sustaining the team
  • Making the impact of software more visible?
    • Some fields like bioinformatics do better
    • Less valued in fields like physics
    • DOIs for code with and without review
    • Institutional change
      • Through fear
      • Through threat
    • Need venue for software
    • Emphasize software more in recommendation letters
      • Provide templates
      • Workshops
    • Software awards

Tuesday

Plenary

  • Jeffrey Carver — SE practices in Science
    • Lessons learned
      • V&V is hard
      • Agile not useful
      • High-level languages are rare
    • SE-CSE Workshop series
      http://SE4Science.org
      • Facilitate interaction between SE and computational scientists
      • Very different pressures on software
        • Testability etc not considered as important
        • Dan Katz: half life of business sw is 6 years; for scientific sw it's 6 months
        • Challenge: eliminate stigma associated with SE
          • How do you introduce SE practices?
            • Demonstrate, don't tell
            • Solve an actual problem
              • Typically speed or bugs
              • Stealth git
            • Small steps; big value
            • Word of mouth
        • NB: scientific productivity ~= sw productivity
  • Matt Turk — Engineering yt
    • Astronomy simulation software
    • Many different systems and formats
      • NIH (not invented here) disease
    • yt project
      http://yt-project.org/
      • Python based
        • Cython C code all generated from Python
        • Bespoke C routines all removed
    • Over time, community grew from users to user-developers
      • Eventually hobbyist developers that did not use their own code
      • Big adoption spike when fully automated installation became available
        • Installs an isolated Python stack for your platform
    • Practices
      • yt enhancement proposals
      • Fido and Code reviews
        • Handle pull requests
          • All requests are reviewed
        • Continuous integration testing based on Jenkins
      • Communication
      • Governance
        • Membership list
          • Contributors are voted in
          • Recognizes contributions
      • Code of conduct
      • No top-down coordinated team
        • Tasks are self-assigned
          • Kanban interface
    • Failure modes
      • Level of engagement
        • Get a life!
      • "Why was I not informed?"
        • "I own this code. Why was this change introduced without me in the loop?"
      • Innovation vs stability
        • Changes that break everything
      • Infrastructure dependencies
      • Sweeping changes
        • People involved have limited time available
      • Narrative documentation
        • Developers don't have experience
      • Overpromise, underdeliver
      • Underpromise, overdeliver
    • Notes
      • Most don't use IDEs
      • Most develop "pragmatically"
      • Lower diversity than field as a whole
        • Mostly white males
      • Little FLOSS experience
      • Innovation diffused in and through yt
    • Citation is hard
  • Caroline Jay, Robert Haines — Software as Academic Output
    • When does sw count as scientific output?
    • Software is hidden
    • Role of sw in research
      • Enables research
      • Enables research in a new way
        • Or to a new group
      • Software is the research
    • Top 23 CHI 2016 papers
      • 18 concerned software
      • Only 4 described software
        • Pseudocode
        • Full analysis results
        • Data + source code
        • Tool + source code
      • Only 2 provided source code
    • Case study: crowdsourcing cataloguing fossil database
      • Web app
      • RQ Should people be required to register before they can contribute to a study?
        • Answer: no, but this should be an option
    • Zenodo is a platform for publishing data sets
      http://zenodo.org
    • Academic software should be
      • Findable
      • Accessible
      • Reusable
      • Extensible
  • Claude Kirchner — Software Heritage
    • Ghezzi 2009 TOSEM paper
      • Only 20% of tools from papers 2001-2006 are installable
    • Software Heritage
      http://www.softwareheritage.org
      • Collect all software and preserve it
      • Index and organize it
      • Unique identifiers
      • Web site to be launched next week
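
The "unique identifiers" item under Software Heritage above does not say how identifiers are derived. One common way to get a stable identifier for archived source code is to hash its content, as Git does for file blobs. The sketch below shows that Git-style scheme purely as an illustration; whether the archive uses this exact scheme is an assumption, and the function name `content_identifier` is hypothetical.

```python
import hashlib


def content_identifier(data: bytes) -> str:
    """Return a Git-style blob hash for the given file content.

    Git hashes the header b"blob <size>\\0" followed by the raw bytes,
    so identical content always yields the same identifier.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


first = content_identifier(b"print('hello')\n")
second = content_identifier(b"print('hello')\n")
changed = content_identifier(b"print('hello!')\n")
```

Here `first` and `second` are equal because the bytes are identical, while `changed` differs. An identifier of this kind depends only on the content, not on where or when the file was collected, which is what makes it useful for indexing and deduplicating a large archive.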

Wednesday

"Open mic" session

  • Daniel Garijo — Software Metadata
    • "Dark software" in geosciences
      • PhD-ware
        • "Don't worry, you don't have to start your code from scratch"
      • Counterpart of "dark data" [Heidorn 2008]
    • Bourne and Gil, quantifying value of software through "reproducibility maps"
      • 2 months of effort to reproduce a study
    • OntoSoft ontology for scientific software metadata [Gil et al 2015]
      http://ontosoft.org/ontology/software/
      • Six dimensions
  • Jurgen Vinju — Organising research team around the research software
    • Use the source
      https://github.com/usethesource
    • Lessons
      • A research team is not a software team
        • Fewer resources
        • More investment in efficiency
      • Seniors responsible for the long term
        • Maintenance, documentation ...
  • Dan Katz — Software Citation
  • Katy Kuksenok — Best Practices by Any Other Name
    • "I don’t want to use version control because I don’t want the world to see my terrible code."
  • Alice Allen — ASCL: restoring reproducibility
    • Making astrophysics software accessible
  • Robert Haines — A short history of research software engineers in the UK
    • Several experiences of contributing software to papers w/o acknowledgment
    • SSI Collaborations Workshop 2012
      http://software.ac.uk/cw12
      • Call to arms
      • New name: research software engineers
  • Ralf Lämmel — 101companies

Thursday

Plenary

  • Cecilia Aragon — UW eScience Institute Initiatives
    http://escience.washington.edu/
  • Rob van Nieuwpoort — eScience in NL
    • Bridge gap between scientists and CS
    • Netherlands eScience Center
      https://www.esciencecenter.nl/
      • Provide services across all of NL
      • Three tracks
        • Management
        • Technology
        • Research
    • Project kickoffs are important
      • Establish rules for co-authorship
        • Research engineers
          • Also publish in eScience venues
        • Domain scientists
      • Agree on software licenses
    • eStep
      • Common repositories for knowledge resulting from projects
        • Software
        • Best practices

Breakouts

Friday

James Howison — depsy

http://depsy.org

Jurgen Vinju — Ossmeter

http://www.ossmeter.org/

  • Automatically analyze open source projects

Outputs

Manifesto

Report

  • Past
  • Program
  • Materials

Credit

  • Citations
    Dan
    • Reviewing FORCE11 etc
  • Taxonomy of software contributor roles
    • How to provide credit to all those involved in software
  • Call for ...
    • Guidance for tenure committees
    • Templates for recommendation letters
      James

Project typology

  • INRIA Evaluation grid
  • Taxonomy
    • Spreadsheet of example projects
    • Typology paper

Research Software Engineering Handbook

  • Handbook TOC

Empirical study

  • Study design

Future research directions

  • State of art
    • How much sw is open?

Visibility

  • Award proposal
  • Proposal for RCN Workshops

Sustainability

  • Sustainability Debt analysis example case

Useful links