“Communicate First, Standardize Second”
That quote from Jean-Claude Bradley is on slide 34 of Cameron Neylon’s presentation “Open Access, Open Data. Open Research?” Very worthwhile…
From Intel’s Research blog, a post about ScienceSim (a virtual world in which you can build scientific instruments). If I had more energy, I’d include a link to William Gibson saying that our children will find our distinction between the real and the virtual quaint.
Wow: the folks at Eigenfactor have created some stunning and useful visualizations of citation patterns in science. Lots of fun to play with… (via The Great Beyond)
Jon Pipitone has written a succinct description of his (increasingly likely) thesis direction. If you’re a high school science teacher, or know any (preferably in Toronto), I’m sure he’d like to hear from you.
I just handed in paperwork to say that Samira Ashtiani Abdi, Jeremy Handcock, and Carolyn MacLeod have completed their Master of Science degrees in Computer Science, and that Samira and Carolyn should be admitted to the PhD. (Jeremy’s going to go be a ski bum instead.) Congratulations to all three—I’m proud to have worked with them.
Samira Ashtiani Abdi: Recovering Related Artifacts in Software Projects’ History: a Comparison of Information Retrieval Based Methods
The average software project is made up of a rich history of artifacts or change events: discussions in mailing lists or forums, bug reports, source code revisions, documentation or wiki page edits, patches, and feature or support requests. Although each of these artifacts is independent, many events may be logically related. For example, an email in a project’s mailing list archives may discuss a related bug report. Similarly, the bug report may be fixed by a source code revision in a version control repository, and this may cause a change in the wiki pages too. We present a system to automatically cluster related events into logical groupings, or find the artifacts related to a target artifact. In addition, we present a visualization tool to help project stakeholders explore these artifacts, which also lets them further expand their search to retrieve more related artifacts in a tree-based fashion. We provide a comparison of three Information Retrieval based methods: Vector Space Model (VSM), Latent Semantic Indexing (LSI), and a novel approach based on approximating the most distinguishing words or phrases considering the properties of software repositories.
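For readers who haven't met the Vector Space Model before, here is a minimal sketch of the general idea: turn each artifact into a TF-IDF weighted word vector and rank the others by cosine similarity to a target. This is not Samira's system. The four artifacts, the whitespace tokenizer, the smoothed IDF formula, and the function names `tfidf_vectors`, `cosine`, and `related` are all made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document name to a {term: tf-idf weight} dictionary."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
        vectors[name] = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dictionaries."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def related(target, docs):
    """Rank every other artifact by similarity to the target, best first."""
    vectors = tfidf_vectors(docs)
    scores = [
        (name, cosine(vectors[target], vec))
        for name, vec in vectors.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical artifacts from an imaginary project history.
artifacts = {
    "email": "crash when parser reads empty config file",
    "bug": "parser crash on empty config file",
    "commit": "fix parser crash for empty config file",
    "wiki": "how to build the documentation with sphinx",
}

ranked = related("bug", artifacts)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

The email and the commit share most of their words with the bug report, so they come out on top. The wiki page shares none, so it scores zero.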
Jeremy Handcock: How Developers Use an Awareness Tool: Patterns and Usage Scenarios
Software developers consult numerous sources of information in order to maintain awareness of what happens within their teams. They seek out information about what their peers are doing and what project artifacts have changed. Although researchers have proposed many tools to facilitate developers in maintaining awareness, there is currently a lack of understanding about how developers might use them—if at all—in real software projects. We have developed a new awareness tool for software developers called Aufait. We studied developers using Aufait in two organizations over a three-week period and found that they adopted it and used it regularly. We identified a number of usage scenarios in each organization: most commonly, developers used Aufait to manage dependencies between team members and to determine how changes might affect them. We also found common usage patterns. Throughout the course of the study, developers were most interested in changes to source code relative to other artifact types and they were primarily interested in changes that occurred very recently. Together, these scenarios and patterns suggest necessary features and implications for the design of future applications. Our results also pose interesting areas of future research.
Carolyn MacLeod: Patterns in Novice Design Analysis Using Spin
I examined and categorized novice Spin users’ difficulties and errors in using it to solve simple concurrent design and algorithm problems, using a qualitative analysis and a grounded theory approach. I found that the novices had greatest difficulties in dealing with nondeterminism and the Promela control flow, and found it very difficult to specify properties in any formal way. In particular, I found that novices had difficulties in formalizing natural language specifications, and in defining abnormal behaviour.
Cory J. Kapser and Michael W. Godfrey: “‘Cloning considered harmful’ considered harmful: patterns of cloning in software.” Empirical Software Engineering, 2008, 13:645-692, DOI 10.1007/s10664-008-9076-6.
“…we have found significant evidence that cloning is often used in a variety of ways as a principled engineering tool.” Comes with a catalog of beneficial cloning design patterns.
“…a large-scale controlled inspection experiment with over 70 professionals was conducted at Microsoft that focused on the relationship between an inspector’s background and his effectiveness during a requirements inspection. The results of the study showed that inspectors with university degrees in majors not related to computer science found significantly more defects than those with degrees in computer science majors. We also observed that level of education (Masters, PhD), prior industrial experience, or other job-related experiences did not significantly impact the effectiveness of an inspector.”
“…we examined pair programming versus solo programming with respect to both thoroughness and fault detection effectiveness of test suites. Branch coverage (BC) and mutation score indicator (MSI) were used as measures of how thoroughly tests exercise programs, and how effective they are, respectively. It turned out that the PP practice did not significantly affect BC and MSI.”
“The author proposes the Reproducible Research Standard for all components of scientific researchers’ scholarship, which should encourage replicable scientific investigation through attribution, the facilitation of greater collaboration, and the promotion of the engagement of the larger community…”
The abstract doesn’t do it justice, so I’ll summarize. Previous work to validate software metrics (cyclomatic complexity, coupling, etc.) has looked at correlations between metric values and things like post-release bug count. The authors repeated those experiments using bivariate analysis so that they could allocate a share of the blame to code size (measured by number of lines) and the metric in question. Turns out that code size accounted for all of the variation, i.e., the metrics didn’t have any actual predictive power once you normalized for the number of lines of code.
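If the size effect seems surprising, a toy simulation shows how it can happen. The sketch below is not the authors' analysis and uses no real data: I generate synthetic modules in which both a made-up "complexity" metric and the defect count depend only on lines of code plus independent noise. I then compare the raw correlation between metric and defects with the partial correlation once size is held fixed. Partial correlation is my stand-in for their bivariate analysis, and every coefficient and noise level here is an arbitrary assumption.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic modules: size drives both the metric and the defects.
rng = random.Random(0)  # fixed seed so the run is repeatable
loc = [rng.uniform(100, 2000) for _ in range(500)]
complexity = [0.1 * n + rng.gauss(0, 20) for n in loc]  # metric tracks size
defects = [0.02 * n + rng.gauss(0, 4) for n in loc]     # defects track size

raw = pearson(complexity, defects)
controlled = partial_corr(complexity, defects, loc)
print(f"raw correlation of metric with defects: {raw:.2f}")
print(f"correlation after controlling for size: {controlled:.2f}")
```

In this made-up world the metric looks like a strong predictor of defects until size is accounted for, at which point the relationship largely disappears. That is the pattern the paper reports for real metrics.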
Good post from Cameron Neylon about the lab notebooks of the future: “The traditional paper notebook is to the fully integrated web based lab record as a card index is to Google.”
I’ve updated my list of unwritten books; what else would you like to see?
I’ve been tagged by Michelle Levesque and David Bolter, so I figure I better respond:
I tag: