Publications


My latest publication, co-authored with Heather Small, addresses ownership of faculty data in the HR system of a public institution. Prior to that, I worked on a Math Search challenge with Ray Larson and Fred Gey of the UC Berkeley School of Information and the UC Berkeley Institute for the Study of Societal Issues. The earlier four publications were co-authored by my colleagues and me under the UC Berkeley Nuclear Forensic Search project, a grant-funded academic and applied research project. In all cases, I was a significant contributor to both the project work and the authorship of these publications.

"Who owns faculty data? Fairness and transparency in UCLA's new academic HR system"

C. Reynolds and H. Small

Paper iConference: The 2015 iConference was held on March 24-27, 2015 in Newport Beach, CA. Since 2005, the iConference series has provided forums in which information scholars, researchers and professionals share their insights on critical information issues in contemporary society. An openness to new ideas and research fields in information science is a primary characteristic of the event.

The iConference series is presented by the iSchools, a worldwide association of Information Schools dedicated to advancing the information field.

Abstract: Beginning in 2015, Opus will be the information system of record for faculty activities at the University of California, Los Angeles (UCLA). Opus will serve as both a profile system, storing data about faculty work, and as a workflow and approval engine for the promotion and tenure process. Opus leverages institutional master data wherever possible to collect data about faculty activity. However, repurposing institutional data collected for purposes not related to academic review necessitates allowing data subjects (UCLA faculty) to contextualize and reframe the data for the review process. Collecting, displaying and storing these augmented records (master data with manually added metadata from faculty) has forced the project team to grapple with questions regarding fairness and transparency to both data subjects and to data consumers. How can we hold to “good design” and usability practices, while faithfully representing the inherent “messiness” of the data? How does the context in which the data was collected impact repurposing the data for academic review? What does it mean to “own” faculty data? This paper outlines our attempts to address these questions, noting the tradeoffs and limitations of the selected solutions.

Paper

"The Abject Failure of Keyword IR for Mathematics Search: Berkeley at NTCIR-10 Math"

R. Larson, C. Reynolds and F. Gey

Paper NTCIR Conference: The NTCIR-10 Conference took place December 9-12, 2014, at NII in Tokyo, Japan. The NTCIR Workshop is an evaluation workshop that aims to advance research on Information Access (IA) technologies such as Information Retrieval, Text Summarisation, Information Extraction, and Question Answering.

The objectives of the National Institute of Informatics Testbeds and Community for Information Access Research (NTCIR) Workshop are:
1. Offer research infrastructure that allows researchers to conduct large-scale evaluations of IA technologies
2. Foster a forum for researchers to share and exchange their findings based on comparable experimental results
3. Facilitate research on evaluation methodologies and performance measures of IA technologies

The prominent concern of the NTCIR Workshop is how to apply laboratory research outcomes to real-world problems. Read more about the conference aims here.
Task Introduction
Task Description

Abstract: This paper demonstrates that classical content search using individual keywords is inadequate for mathematical formulae search. For the NTCIR10 Math Pilot Task, the authors used a standard indexing by content word for search coupled with search for components of mathematical formulae. This was followed by formula extraction from the top ranked documents. Performance was terrible, even for partial relevance. The further inclusion of some manual reformulation of topics into queries did not improve retrieval performance.

Paper

"Database Heterogeneity in a Scientific Application"

F. Gey, C. Reynolds, R. Larson and E. Sutton

The Nuclear Forensic Search Project team presented a poster at the IASSIST 2012 Conference, June 4-8, 2012 in Washington, DC.

Poster IASSIST Conference: This year's conference theme was Data Science for a Connected World: Unlocking and Harnessing the Power of Information. The theme reflects the growing desire of research communities and government agencies to build connections and benefit from the better use of data through practicing good management, dissemination and preservation techniques.

The theme is intended to stimulate discussions on building connections across all scholarly disciplines, governments, organizations, and individuals who are engaged in working with data. IASSIST as a professional organization has a long history of bringing together those who provide information technology and data services to support research and teaching in the social sciences.

Poster-Presentation

"Nuclear Forensics as a Digital Library Search Problem"

F. Gey, R. Larson, E. Sutton, C. Reynolds, D. Weisz and M. Proveaux

The Nuclear Forensic Search Project team presented at the DNDO-NSF ARI Grantees Conference, July 23-25, 2012 in Leesburg, VA.

Poster ARI Grantees Conference: The ARI is a joint Domestic Nuclear Detection Office (DNDO) and National Science Foundation (NSF) program seeking novel cross-cutting research that will enable the nation's ability to prevent and respond to nuclear or radiological threats. This continuing program intends to expand its scope this year to include research in response and recovery from nuclear or radiological attack, with emphasis on multidisciplinary approaches. This year's solicitation topics will encompass two broad areas. First are investigations in new technologies, concepts or approaches to enhance the Global Nuclear Detection Architecture (GNDA) that in turn will lead to improved capabilities for the detection and interdiction of nuclear or radiological threat materials or devices. Second are investigations to aid in the effective response and recovery from nuclear or radiological events at the local, state and Federal level, to include investigations in nuclear forensics. Primary objectives of ARI include advancing fundamental knowledge in the above areas and developing intellectual capacity in fields relevant to long-term advances in these areas.

Poster

"Nuclear Forensics: A Scientific Search Problem"

F. Gey, C. Reynolds, R. Larson and E. Sutton

The Nuclear Forensic Search Project team presented a paper "Nuclear Forensics: A Scientific Search Problem" at the LWA 2012 Conference, September 12-14, 2012 in Dortmund, Germany.

Paper LWA Conference: LWA stands for "Lernen, Wissen, Adaption" (Learning, Knowledge, Adaptation). It is the joint forum of four special interest groups of the German Computer Science Society (GI). Following the tradition of past years, LWA provides a joint forum for researchers to bring insights to recent trends, technologies and applications, and to promote interaction among the SIGs.

The GI-Special Interest Groups (SIGs) are:
-- FG-ABIS (Adaptivität und Benutzermodellierung in interaktiven Softwaresystemen)
-- FG-IR (Information Retrieval)
-- FG-KDML (Knowledge Discovery, Data Mining und Maschinelles Lernen)
-- FG-WM (Wissensmanagement)

Abstract: In this paper, we introduce a unique sub-field of scientific search: nuclear forensics. Nuclear forensics plays an important technical role in international security. We describe a conceptual model of nuclear forensics matching as a particular form of directed graph matching. The unique characteristic of this match is that the attributes of the graph nodes (measurement of mass of nuclear isotopes in a decay chain) vary over time, so matching must include a time-varying computation at the heart of the match. Using a database of spent nuclear fuel samples we formulate a search experiment to try to identify the particular nuclear reactor from which a particular sample came. Preliminary results (Precision 0.34 at rank 10) are promising given the simplifying assumptions made.

Paper

"Applying Digital Library Technologies to Nuclear Forensics"

F. Gey, C. Reynolds, R. Larson and E. Sutton

The Nuclear Forensic Search Project team presented a paper "Applying Digital Library Technologies to Nuclear Forensics" at the TPDL 2012 Conference, September 23-27, 2012 in Paphos, Cyprus.

Paper TPDL Conference: The International Conference on Theory and Practice of Digital Libraries is the successor of the European Conference on Research and Advanced Technology for Digital Libraries (ECDL). TPDL/ECDL has been the leading European scientific forum on digital libraries for 15 years. The conference brings together researchers, developers and content providers in the field.

Abstract: Digital Libraries will enhance the value of forensic endeavors if they provide tools that enable data mining capabilities. In fact, collecting data without such tools can result in investigators becoming overwhelmed. Currently, the quantity of highly dangerous radioactive materials is increasing with the advancement of civilizations' scientific inventions. This creates a demand for an equivalently sophisticated forensics capability that prevents misuse and brings malicious intent to justice. Our forensics approach applies digital library and data mining techniques. Specifically, the forensic investigator will utilize our digital library system which has been enhanced with advanced data mining query tools in order to determine attribution of material to their geographic sources and threat levels, enabling tracing and rating of smuggling activities.

The proceedings are published as a volume of Springer's Lecture Notes in Computer Science (LNCS) series.

Paper