I'm a computational linguist finishing my PhD in the Linguistics Program at the Graduate Center, City University of New York. Previously, I spent almost four years at Microsoft Research's Sociotechnical Alignment Center, where I developed conceptual and operational measurement infrastructure, grounded in validity theory, for evaluating a wide range of AI systems.

My work (re-)evaluates language models, models of language, models of Language, and models of specific languages, with attention to the assumptions and commitments built into these systems and to how those choices surface in NLP methodologies and methods.

Here, language models are computational systems for processing and generating language, from finite-state models to statistical, hybrid, and neural models; models of language are accounts of how linguistic systems work, examining the structures, patterns, and functions that characterize languages and their organization; models of Language are theories of the human language faculty that account for the capacities and processes underlying linguistic representation, learning, and processing; and models of specific languages are analyses of individual languages that model their sound systems, word-formation and sentence-structure patterns, meanings and discourse uses, and sociolinguistic or typological variation. Methodologies are the reasoning and commitments that determine how research questions are posed, what counts as evidence, and how findings are interpreted, while methods are the procedures and tools used to generate and analyze data, including experimental designs, datasets, annotation practices, and quantitative or qualitative techniques.

I use that analysis to make the resulting gaps, distortions, and misrepresentations measurable, diagnosable, and actionable, and to build better models, datasets, and evaluation frameworks in response.

I'm also especially interested in representational infrastructure, from character and text encodings and XML schemas to metadata standards and ontologies, and in how these systems shape what language data can represent, preserve, obscure, or distort.