Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Splash Page
Posts
articles
Limits for Learning with Language Models
Published in Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), 2023
Nevertheless, several recent papers provide empirical evidence that LLMs fail to capture important aspects of linguistic meaning. Focusing on universal quantification, we provide a theoretical foundation for these empirical findings by proving that LLMs cannot learn certain fundamental semantic properties including semantic entailment and consistency as they are defined in formal semantics. More generally, we show that LLMs are unable to learn concepts beyond the first level of the Borel Hierarchy, which imposes severe limits on the ability of LMs, both large and small, to capture many aspects of linguistic meaning. This means that LLMs will continue to operate without formal guarantees on tasks that require entailments and deep linguistic understanding. Read more
Recommended citation: Asher, N., Bhar, S., Chaturvedi, A., Hunter, J., & Paul, S. (2023). Limits for Learning with Language Models. In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023) (pp. 236–248).
Analyzing semantic faithfulness of language models via input intervention on conversational question answering
Published in Computational Linguistics, 2024
Transformer-based language models have been shown to be highly effective for several NLP tasks. In this article, we consider three transformer models, BERT, RoBERTa, and XLNet, in both small and large versions, and investigate how faithful their representations are with respect to the semantic content of texts. We formalize a notion of semantic faithfulness, in which the semantic content of a text should causally figure in a model’s inferences in question answering Read more
Recommended citation: Chaturvedi, A., Bhar, S., Saha, S., Garain, U., & Asher, N. (2024). Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question Answering. Computational Linguistics, 50(1), 119-155.
Strong hallucinations from negation and how to fix them
Published in Findings of ACL 2024, 2024
Despite great performance on many tasks, language models (LMs) still struggle with reasoning, sometimes providing responses that cannot possibly be true because they stem from logical incoherence. We call such responses strong hallucinations and prove that they follow from an LM’s computation of its internal representations for logical operators and outputs from those representations Read more
Recommended citation: Asher, Nicholas, and Swarnadeep Bhar. "Strong hallucinations from negation and how to fix them." arXiv preprint arXiv:2402.10543 (2024).
expressions
Post-Structuralism, Restructured: Re-presentations of Deleuze and Guattari’s A Thousand Plateaus
This was my final project for an Information Studies class I took back in 2006 at UT-Austin. Our assignment was to transform information from one form to another, and I chose to perform this analysis of Deleuze and Guattari’s A Thousand Plateaus. I scanned and OCRed the entire book and did a visual frequency representation of certain words. Read more
IPoXP: Internet Protocol over Xylophone Players
We introduce IP over Xylophone Players (IPoXP), a novel Internet protocol between two computers using xylophone-based Arduino interfaces. In our implementation, human operators are situated within the lowest layer of the network, transmitting data between computers by striking designated keys. We discuss how IPoXP inverts the traditional mode of human-computer interaction, with a computer using the human as an interface to communicate with another computer Read more
0 (the game)
One of the many forks of the popular game 1024 by Veewo Studio (which is conceptually similar to Threes by Asher Vollmer). Try to combine all the 0 tiles until they add up to 1. Read more
Apparent Things
A Twitter bot powered by tweets proclaiming that something ‘is apparently a thing.’ Read more
robots.txt.php
An algorithmically-generated robots.txt, which disallows all bots with one exception: the bot requesting the file is allowed full access. Read more
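The behavior described above can be sketched in a few lines. This is a minimal illustration in Python, not the project's actual PHP source; the function name `build_robots_txt` is hypothetical, and it assumes the requester is identified by its User-Agent header. Real crawlers usually match on a short product token rather than the full header, so treat this as a sketch of the idea only.

```python
def build_robots_txt(user_agent: str) -> str:
    """Return a robots.txt body that blocks every crawler except the requester."""
    # Collapse whitespace so a crafted User-Agent cannot inject extra rule lines.
    agent = " ".join(user_agent.split())
    if not agent or agent == "*":
        # No usable identity: fall back to disallowing everyone.
        return "User-agent: *\nDisallow: /\n"
    return (
        "User-agent: *\n"
        "Disallow: /\n"
        "\n"
        f"User-agent: {agent}\n"
        "Disallow:\n"  # an empty Disallow value permits full access
    )


if __name__ == "__main__":
    # "ExampleBot/1.0" is a made-up crawler name used only for illustration.
    print(build_robots_txt("ExampleBot/1.0"))
```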
dystopedia
A Markov chain Twitter bot trained on titles of Wikipedia articles that have been deleted. Read more
AcademicPages
AcademicPages is a ready-to-fork GitHub Pages template for academic personal websites, based on structured data in markdown files. I created it for this website, then released it so others can make their own, which are hosted for free by GitHub. Over 500 people have! Read more
IndentationError
An avant-garde poem about Python Read more
talks
A Communicative Ethnography of Argumentative Strategies in a Wikipedian Content Dispute
Published in Exploring New Media Worlds, 2008
Conceptions and Misconceptions Academics Hold About Wikipedia
Published in Annual Wikimedia Conference (Wikimania), 2008
Working With/in Wikipedia: Infrastructures of Knowing and Knowledge Production
Published in Annual Conference on Science and Technology in Society, 2009
Evolving Governance and Media Use in Wikipedia: A Historical Account
Published in Media in Transition 6, 2009
Algorithmic Governance: The Social Roles of Bots and Assisted Editing Tools
Published in First Annual Wikiconference NYC, 2009
Trace Ethnography: An ANT Method for the Study of Sociotechnical Networks
Published in the Second Annual Media Sociology Forum, 2009
The Social Roles of Bots and Assisted Editing Tools
Published in International Symposium on Wikis and Open Collaboration, 2009
A short paper showing the recent explosive growth of automated editors (or bots) in Wikipedia, which have taken on many new tasks in administrative spaces. Read more
Where Are the Missing Wikipedians? The Sociology of a Bot
Published in Annual Meeting of the Society for the Social Study of Science (4S), 2009
The Wisdom of Bots: A Critique of ‘Self-Organization’ in Wikipedia
Published in Critical Point of View: Wikipedia and the Politics of Open Knowledge, 2010
The Work of Sustaining Order in Wikipedia: The Banning of a Vandal
Published in Conference on Computer Supported Cooperative Work, 2010
This paper traces out a heterogeneous network of humans and non-humans involved in the identification and banning of a single vandal in Wikipedia. Read more
Bot Politics: How is Automation Changing the Wikipedian Society? Critical Point of View II
Published in Critical Point of View: Wikipedia and the Politics of Open Knowledge, 2010
Academic Researchers in Wikimedia Communities: Ethics, Methods, and Policies
Published in Wikimania 2010, 2010
A panel intended to foster a dialog between academic researchers who study Wikimedia projects and the Wikimedia community. Read more
Trace Ethnography: Following Coordination through Documentary Practices
Published in Hawaii International Conference on System Sciences, 2011
We detail the methodology of ‘trace ethnography’, which combines the richness of participant-observation with the wealth of data in logs so as to reconstruct patterns and practices of users in distributed sociotechnical systems Read more
Machine-Generated Content: Bots and the Governance of Wikipedia
Published in Digital Media and Learning (DML), 2011
Participation in Wikipedia’s Article Deletion Processes (with Heather Ford)
Published in International Symposium on Wikis and Open Collaboration, 2011
This paper investigates Wikipedia's article deletion processes, finding that they are heavily populated by specialists. Read more
‘The Internet is Here’: The Virtuality of ‘On-line Communities in Physical Spaces
Published in Annual Meeting of the Society for the Social Study of Science (4S), 2011
User-Generated Platforms in Wikipedian Governance
Published in Annual Meeting of the Society for the Social Study of Science (4S), 2011
Improving Wikipedia’s Notifications to Rejected Contributors
Published in GCOE International Symposium on Informatics Education, 2012
Black-boxing the user: internet protocol over xylophone players (IPoXP)
Published in Conference on Human Factors in Computing (CHI), 2012
We introduce IP over Xylophone Players (IPoXP), a novel Internet protocol between two computers using xylophone-based Arduino interfaces Read more
Hunting for Fail Whales: Lessons from Deviance and Failure in Social Computing
Published in Conference on Human Factors in Computing (CHI), 2012
Defense Mechanism or Socialization Tactic? Improving Wikipedia’s Notifications to Rejected Contributors
Published in International Conference on Weblogs and Social Media (ICWSM), 2012
A descriptive study of Wikipedia's highly-automated socialization processes and an A/B test to improve templated messages to newcomers. Read more
Trace literacy: a framework for holistically conceptualizing newcomer socialization in socio-technical systems
Published in Infosocial, 2012
Time to Degree: Examining the Experiences of Graduate Students in the Long-Term Ecological Research Network
Published in Annual Meeting of the Society for the Social Study of Science (4S), 2012
What Aren’t We Measuring? Methods for Quantifying Wiki-Work.
Published in International Symposium on Wikis and Open Collaboration (WikiSym 2012), 2012
Actor-Network Theory
Published in Social Aspects of Information Systems course, 2013
An introduction to Actor Network Theory for students in the Masters of Information Management and Systems (MIMS) course Read more
Using Edit Sessions to Measure Participation in Wikipedia (with Aaron Halfaker)
Published in Conference on Computer Supported Cooperative Work, 2013
This paper establishes a quantitative metric for measuring editor activity through temporal edit sessions. Read more
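One way to picture a session-based activity metric is to group an editor's timestamps into runs separated by gaps of inactivity. The sketch below is illustrative only: the function name `group_into_sessions` and the one-hour default cutoff are assumptions made for this example, not details stated in this listing.

```python
from datetime import datetime, timedelta


def group_into_sessions(timestamps, cutoff=timedelta(hours=1)):
    """Group edit timestamps into sessions split by gaps longer than `cutoff`."""
    sessions = []
    for ts in sorted(timestamps):
        # Extend the current session if this edit follows the previous one closely.
        if sessions and ts - sessions[-1][-1] <= cutoff:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


if __name__ == "__main__":
    # Made-up timestamps for a single hypothetical editor.
    edits = [
        datetime(2013, 1, 1, 10, 0),
        datetime(2013, 1, 1, 10, 20),
        datetime(2013, 1, 1, 12, 0),
        datetime(2013, 1, 1, 12, 5),
    ]
    for session in group_into_sessions(edits):
        print(len(session), "edits over", session[-1] - session[0])
```

Counting sessions, or summing their durations, then gives a rough per-editor measure of activity that raw edit counts alone do not capture.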
Community, Impact, and Credit: Where Do I Submit My Papers?
Published in ACM Conference on Computer-Supported Cooperative Work (CSCW), 2013
Values Where? Interrogating Client-Side Scripting as a Design Process
Published in Theorizing the Web, 2013
When the Levee Breaks: Without Bots, What Happens to Wikipedia’s Quality Control Processes? (with Aaron Halfaker)
Published in International Symposium on Wikis and Open Collaboration (WikiSym 2013), 2013
This paper examines what happened when one of Wikipedia's counter-vandalism bots unexpectedly went offline. Read more
Hadoop as Grounded Theory: Is an STS Approach to Big Data Possible?
Published in Annual Meeting of the Society for the Social Study of Science (4S), 2013
Design by Bot: Power and Resistance in the Development of Automated Software Agents
Published in Annual Meeting of the Association of Internet Researchers (AoIR), 2013
Size Matters: How Big Data Changes Everything
Published in Bangkok Scientifique, 2013
A talk introducing various concepts around large-scale data analysis to a general audience, including spam detection and governmental surveillance. Read more
Robotic Ethics and Opportunities
Published in Robots and New Media, 2014
A panel discussing the ethical and political issues that are raised with autonomous robots and software bots. Read more
Governing the Commons
Published in History of Information, 2014
A lecture on the history of Wikipedia, in the broader context of the history of reference works. Read more
Successor Systems: Enacting Ideological Critique Through the Development of Software
Published in Theorizing the Web, 2014
Successor Systems: The Role of Reflexive Algorithms in Enacting Ideological Critique
Published in The Contours of Algorithmic Life, 2014
Data-Driven Data Research Using Data and Databases: A Practical Critique of Methods and Approaches in “Big Data” Studies
Published in Annual Meeting of the International Communication Association (ICA), 2014
This panel focuses on the challenges faced by researchers conducting mixed-method research into online platforms, particularly where large amounts of data are widely available. Read more
‘Big Data is Bullshit’: Scoping the Next 5 Years of Digital Data Research
Published in Annual Meeting of the International Communication Association (ICA), 2014
Successor Systems: The Role of Reflexive Algorithms in Enacting Ideological Critique
Published in Annual Meeting of the Society for the Social Study of Science (4S), 2014
Successor Systems: The Role of Reflexive Algorithms in Enacting Ideological Critique
Published in Annual Meeting of the Association of Internet Researchers (AoIR), 2014
Defining, Designing, and Evaluating Civic Values in Human Computation and Collective Action Systems (with Nathan Matias)
Published in Human Computation Conference (HCOMP), Citizen-X Workshop, 2014
We review various crowdsourcing and collective action systems, identifying particular sets of civic values and assumptions. Read more
Supporting Change from Outside Systems with Design and Data
Published in Berkman Center for Internet and Society, 2014
Does Facebook Have Civil Servants? On Governmentality and Computational Social Science
Published in CSCW Workshop on Ethics for Studying Sociotechnical Systems in a Big Data World, 2015
Situated knowledges and successor systems: developing CSCW systems to enact ideological critiques
Published in CSCW Workshop on Feminism and Feminist Approaches in Social Computing, 2015
Trace Ethnography Workshop
Published in ISchools Conference, 2015
Moderating Online Conversation Spaces
Published in Social Aspects of Information Systems course, 2015
An overview of how various online platforms moderate content, discussing issues that link up to the theories discussed in the Social Aspects of Information Systems class. Read more
Peer Production and Wikipedia
Published in Social Aspects of Information Systems course, 2015
An overview of Wikipedia and other peer production platforms, discussing issues that link up to the theories discussed in the Social Aspects of Information Systems class. Read more
But it Wouldn’t Be an Encyclopedia; It Would Be a Wiki: Wikipedia and the Repurposing of WikiWikiWeb
Published in Annual Meeting of the International Communication Association (ICA), 2015
In this talk, I examine the early history of “anyone can edit” wiki software – originally developed in 1995, six years before Wikipedia’s origin – focusing on the ways in which this technological infrastructure has been repurposed across communities, domains, and scales. Read more
Bot-Based Collective Blocklists in Twitter: The Counterpublic Moderation of a Privately-Owned Networked Public Space
Published in Annual Meeting of the Association of Internet Researchers (AoIR), 2015
This presentation introduces bot-based collective blocklists (or blockbots) in Twitter, which have been created to help various groups better moderate their own experiences on the site. Read more
Crowdsourcing: Theoretical Considerations
Published in Crowdsourcing and the Academy Symposium, 2015
A panel discussing how academics use crowdsourcing in research. Read more
The Bot Multiple: Unpacking the Materialities of Automated Software Agents
Published in Annual Meeting of the Society for the Social Study of Science (4S), 2015
I examine the roles that automated software agents (or bots) play in the governance and moderation of Wikipedia, Twitter, and reddit – three online platforms that differently uphold a related set of commitments to ‘open’ and ‘public’ online participation. Read more
Why bots are my favorite contribution to Wikipedia
Published in Wikipedia 15th Anniversary Birthday Bash, 2016
A short talk to open up an event celebrating the 15th anniversary of Wikipedia. The prompt we were given was "Why [x] is my favorite contribution to Wikipedia." Read more
Scraping Wikipedia Data
Published in The Hacker Within, BIDS, 2016
A tutorial (with Jupyter notebooks) about how to use APIs to query structured data from Wikipedia articles and the Wikidata project. Read more
“What the hack?” Hacking culture and discourse in data science pedagogy (with Brittany Fiore-Gartland)
Published in Theorizing the Web, 2016
Moderating harassment in Twitter with blockbots: a counterpublic and algorithmic strategy
Published in Theorizing the Web, 2016
Algorithms as agents of gatekeeping, governance, and articulation work in Wikipedia
Published in Algorithms, Automation, and Politics workshop, 2016
I discuss how algorithmic systems are deployed to enforce particular behavioral and epistemological standards in Wikipedia, which can become a site for collective sensemaking among veteran Wikipedians. Read more
Successor Systems: Lessons for Big Data From Feminist Epistemology and Activism
Published in Big Data: Critiques and Alternatives workshop, 2016
I discuss four data-intensive activist projects as "successor systems," discussing the political and epistemological implications of using data to advance activist projects. Read more
Drowning in Data: Industry and Academic Approaches to Mixed Methods in “Holistic” Big Data Studies
Published in Annual Meeting of the International Communication Association (ICA), 2016
This panel discusses the potentials and complications of mixed-methods research in big data studies, specifically in cases when population-level data is available. Read more
Administrative Support Bots in Wikipedia: How Automation Can Transform the Affordances of Platforms and the Governance of Communities
Published in Communicating with Machines workshop, 2016
I discuss cases from a multi-year ethnographic study of automated software agents in Wikipedia, where ‘bots’ have fundamentally transformed the nature of the ‘anyone can edit’ encyclopedia project. Read more
Governing Open Source Projects at Scale: Lessons from Wikipedia’s Growing Pains
Published in SciPy, 2016
Many open source, volunteer-driven projects begin with a small, tight-knit group of collaborators, but then rapidly expand far faster than anyone expects or plans for. I discuss cases of governance growing pains in Wikipedia, which have many lessons for running open source software projects. Read more
Community Sustainability in Wikipedia: A Review of Research and Initiatives
Published in PyData SF, 2016
Wikipedia relies on one of the world’s largest open collaboration communities. Since 2001, the community has grown substantially and faced many challenges. This presentation reviews research and initiatives around community sustainability in Wikipedia that are relevant for many open source projects, including issues of newcomer retention, governance, automated moderation, and marginalized groups. Read more
“The Wisdom of Bots:” An ethnographic study of the delegation of governance work to information infrastructures in Wikipedia
Published in Annual Meeting of the Society for the Social Study of Science (4S), 2016
Wikipedians rely on software agents to govern the ‘anyone can edit’ encyclopedia project, in the absence of more formal and traditional organizational structures. Lessons from Wikipedia’s bots speak to debates about how algorithms are being delegated governance work in sites of cultural production. Read more
Demystifying Algorithmic Processes: The Case of Wikipedia
Published in The 21st Annual BCLT/BTLJ Symposium, 2017
This talk is part of a panel session titled “Demystifying Algorithmic Processes: What is the role of algorithms in online platforms, what can they do and not do, and how should they be governed?” Read more
Jupyter and the Changing Rituals around Computation
Published in JupyterCon, 2017
We (Stuart Geiger, Brittany Fiore-Gartland, and Charlotte Cabasse-Mazel) share ethnographic findings made observing and working with Jupyter notebooks, focusing on how people use Jupyter to create and deliver computational narratives in particular local contexts, like classrooms, hackathons, research collaborations, and more. Read more
Autoethnographic Methods for Studying Data-Driven Knowledge Production
Published in 2017 Annual Meeting of the Society for the Social Studies of Science (4S), 2017
An overview of how to study data science ethnographically by personally engaging in various practices of data science. Read more
Computational Ethnography and the Ethnography of Computation
Published in Berkeley Institute for Data Science, 2017
Ethnography is traditionally a qualitative and inductive methodology – with its origins in cultural anthropology – that is now widely used to holistically investigate people’s lived experiences in and across cultures. In this talk, I define and discuss two ways of thinking about the role of ethnographic methods around computation, then discuss how my research relates to both. Read more
Are the bots really fighting? Behind the scenes of a reproducible replication
Published in UC-Berkeley Department of Statistics: Reproducible and Collaborative Data Science, 2017
A guest lecture for Fernando Perez’s STAT 159/259 course on Reproducible and Collaborative Data Science, in which I discuss issues of open science and reproducibility around our recent paper Operationalizing conflict and cooperation between automated software agents in Wikipedia: A replication and expansion of ‘Even Good Bots Fight’ Read more
“But it wouldn’t be an encyclopedia; it would be a wiki”: The changing imagined affordances of wikis, 1995-2002
Published in 2017 Annual Meeting of the Association of Internet Researchers, 2017
This paper examines the early history of “anyone can edit” wiki software – originally developed in 1995, six years before Wikipedia’s origin. While today, the idea of a wiki is associated with large-scale, massively-distributed encyclopedic knowledge production, this was not always the case. Articles on pre-Wikipedia wikis were often closer to a Joycean stream of consciousness than Wikipedia’s Britannica-inspired texts that speak in a single voice, and the underlying wiki platform lacked many of the affordances that are now taken for granted in wiki platforms. In fact, the creator of the first wiki advised Wikipedia’s co-founders that the goals of creating a general-purpose encyclopedia and a wiki were inherently contradictory. Read more
The Humanity of Artificial Intelligence
Published in Bay Area Science Festival, 2017
Today, “artificial intelligence” seems to be everywhere – in our phones, vacuums, hospitals, and inboxes – but it can be hard to separate science fiction from science fact. Many discussions about AI imagine a fully autonomous superintelligence that designs itself with little to no human intervention, making decisions in ways that humans cannot possibly understand. Yet the work of designing, developing, engineering, training, and testing such systems requires a massive amount of human labor, which is typically erased when such systems are released as products. In this talk, I give a human-centered, behind-the-scenes introduction to machine learning, illustrating the creative, interpretive, and often messy work humans do to make autonomous agents work. Understanding the humanity behind artificial intelligence is important if we want to think constructively about issues of bias, fairness, accountability, and transparency in AI. Read more
Computational Ethnography and the Ethnography of Computation: The Case for Context
Published in School of Information and Library Science, University of North Carolina at Chapel Hill, 2018
Ethnography is traditionally a qualitative and inductive methodology that is now widely used to holistically investigate people’s lived experiences in and across cultures. In this talk, I define and discuss two ways of thinking about the role of ethnographic methods around computation, then discuss how my research relates to both. Read more
Computational Ethnography and the Ethnography of Computation: The Case for Context
Published in School of Information Sciences, University of Illinois at Urbana-Champaign, 2018
Ethnography is traditionally a qualitative and inductive methodology that is now widely used to holistically investigate people’s lived experiences in and across cultures. In this talk, I define and discuss two ways of thinking about the role of ethnographic methods around computation, then discuss how my research relates to both. Read more
Computational Ethnography and the Ethnography of Computation: The Case for Context
Published in College of Information Studies, University of Maryland at College Park, 2018
Ethnography is traditionally a qualitative and inductive methodology that is now widely used to holistically investigate people’s lived experiences in and across cultures. In this talk, I define and discuss two ways of thinking about the role of ethnographic methods around computation, then discuss how my research relates to both. Read more
Publics: Witnessing and Measuring
Published in UC-Berkeley: Human Contexts and Ethics of Data course, 2018
A guest lecture for Cathryn Carson and Margo Boenig-Liptsin’s course on Human Contexts and Ethics of Data (HIST 182C, STS 100C), focusing on how various publics generate, analyze, and interpret data. Read more
The Human Contexts of Data: Infrastructures, Institutions, and Interpretations
Published in University of Manchester, Data Science Institute, 2018
In this talk, I discuss the role of qualitative and ethnographic methods in relation to computer, information, and data science. These holistic, reflexive, and meta-level approaches to studying data and computation in context help us better understand how to both support and practice data analytics at various scales. Read more
Computational Ethnography and the Ethnography of Computation: The Case for Context
Published in IT University of Copenhagen, ETHOSlab, 2018
Ethnography is traditionally a qualitative and inductive methodology that is now widely used to holistically investigate people’s lived experiences in and across cultures. In this talk, I define and discuss two ways of thinking about the role of ethnographic methods around computation, then discuss how my research relates to both. Read more
Key Values: What We Talk About When We Talk About ‘Open Science’
Published in Open Science Symposium, Department of Second Language Studies, University of Hawaiʻi at Mānoa, 2018
Openness in science is hard to disagree with as an abstract principle, but what exactly do we mean when we call for science to be made open – or more open than before? In this talk, I introduce and unpack the many different goals, strategies, products, values, and assumptions of the broad open science movement. Read more
The Human Contexts of Computation and Data: Infrastructures, Institutions, and Interpretations
Published in University of California at San Diego, The Design Lab, 2018
In this talk, I discuss the role of qualitative and ethnographic methods in relation to computer, information, and data science. These holistic, reflexive, and meta-level approaches to studying data and computation in context help us better understand how to both support and practice data analytics at various scales. Read more
Knowing User Populations at Scale: From the Science of the State to Platform Governmentality
Published in 2018 Annual Conference of the International Communication Association, 2018
How can institutions that own and operate large-scale social media platforms come to know “their users” at scale? In this talk, I discuss ways of knowing user populations at scale, drawing on Foucault’s account of governmentality, particularly the role of statistics in the formation of the modern nation state. Read more
The Types, Roles, and Practices of Documentation in Data Analytics Open Source Software Libraries: A Collaborative Ethnography of Documentation Work
Published in 2018 European Conference on Computer-Supported Cooperative Work, 2018
Data analytics increasingly relies on open source software (OSS) libraries that extend scripting languages like Python and R. Software documentation for these libraries is crucial for people across all experience levels, but documentation work raises many challenges, particularly in open source communities. In this collaboration between ethnographers and data scientists, we discuss the types, roles, practices, and motivations around documentation in data analytics OSS libraries. Read more
Designing and Using Data Science Ethically
Published in Machine Learning and User Experience San Francisco (MLUXSF), 2018
With the rise of Machine Learning and AI to solve human-focused needs, how do we design and use data science ethically to help empower and support people? Read more
Qualitative and Quantitative Studies of Wikipedia (with Aaron Halfaker)
Published in ACM International Symposium on Open Collaboration (OpenSym), 2018
We reflect on a decade of studying Wikipedia using qualitative and quantitative methods. Read more
Cooking Data with Care: The Role of Contextual Inquiry in Large-Scale Quantitative Research
Published in eScience Institute, University of Washington, 2019
In this talk, I argue that there is often substantial qualitative contextual inquiry and expertise deployed in quantitative methods. Such insights are crucial to ‘cooking data with care,’ as Geoff Bowker advocated. Read more
Documenting Data Science and Documentation in Data Science: an Ethnographic Exploration
Published in eScience Institute, University of Washington, 2019
In this talk, I discuss the central yet often passed over role of documentation in data science, based on several recent and ongoing studies and projects about the role and importance of documentation in software packages, datasets, analysis code, research protocols, and research teams. Read more
Ethics and Policy Implications of Big Data
Published in University of California, San Diego, 2019
Panelist on the ‘Knowledge and Culture’ panel at this workshop on algorithms and big data, sponsored by a number of different departments across UCSD. Read more
The Invisible Work of Maintaining & Sustaining Open-Source Software
Published in SciPy 2019, 2019
Opening keynote at SciPy 2019, in which I discuss a wide range of issues around the work of developing and maintaining open-source software, based on our team’s ongoing mixed-method research into this topic. Read more
Garbage In, Garbage Out? Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled Training Data Comes From?
Published in ACM FAT* 2020, 2020
We investigate to what extent a sample of machine learning application papers in social computing — specifically papers from ArXiv and traditional publications performing an ML classification task on Twitter data — give specific details about whether such best practices were followed. Read more
‘I didn’t sign up for this’: The Invisible Work of Maintaining Free/Open-Source Software Communities.
Published in 4S 2020, 2020
I discuss a wide range of issues around the burdens maintainers face in developing and maintaining open-source software, based on our team’s ongoing mixed-method research into this topic. Read more
Garbage In, Garbage Out Revisited: Labeling and Dataification Practices Across Disciplines.
Published in 2022 IEEE 8th International Conference on Collaboration and Internet Computing, 2022
I discuss a range of issues and best practices around data labeling, verification, and quality across disciplines. Read more
teaching
CCTP-505: Introduction to Communication, Culture, and Technology (Fall 2008)
Graduate course, Teaching assistant
CCTP-505 is an introduction to the Communication, Culture, and Technology M.A. program at Georgetown, which all incoming CCT students must take their first semester. Read more
CCTP-783: Qualitative Data Analysis (Fall 2009)
Graduate course, Teaching assistant
CCTP-783 is a core methods course for the CCT program, one of multiple classes M.A. students can take to satisfy their core methods requirement. Read more
INFO-203: Social Aspects of Information Systems (Spring 2012)
Graduate course, Teaching assistant
INFO 203 is a required course for UC-Berkeley's Master of Information Management & Systems (MIMS) program, and is open to graduate students from all departments. Read more
INFO-203: Social Aspects of Information Systems (Spring 2013)
Graduate course, Teaching assistant
INFO 203 is a required course for UC-Berkeley's Master of Information Management & Systems (MIMS) program, and is open to graduate students from all departments. Read more
INFO-103: History of Information (Spring 2014)
Undergraduate course, Teaching assistant
INFO 103 is an elective undergraduate course in the UC-Berkeley School of Information, crosslisted with History, Media Studies, and Cognitive Science. Read more
SOC-167: Sociology of Virtual Communities and Social Media (Spring 2014)
Undergraduate course, Adjunct lecturer
SOC 167 is an elective undergraduate course in UC-Berkeley's Sociology Department, providing a broad overview of how classic concepts in the social sciences play out in social media and virtual communities Read more
SOC-167: Sociology of Virtual Communities and Social Media (Summer 2014)
Undergraduate course, Instructor of record
SOC 167 is an elective undergraduate course in UC-Berkeley's Sociology Department, providing a broad overview of how classic concepts in the social sciences play out in social media and virtual communities Read more
Software Carpentry Instructor
Software Carpentry is a global non-profit organization that provides free, short workshops on scientific computing and data science. I have been a certified instructor with SWC since May 2016. Read more
Peer Learning Group Coordinator
Since Fall 2016, I have been the lead coordinator for The Hacker Within, a weekly peer learning group for scientific computing and data science, which is run out of the Berkeley Institute for Data Science. Read more
COGR 201C: Discourse Analysis: Classical, Critical, Computational (Spring 2021)
Graduate-level course on discourse analysis for UCSD Communication Read more
DSC 290: Algorithmic Auditing Lab (Fall 2021)
Graduate-level course on auditing algorithms for bias, fairness, etc. Read more
COMM 106D: Data and Culture (Winter 2022, UCSD)
Intermediate undergraduate course on the relationship between data and culture Read more
DSC 291: Data Science, Ethics, and Society (Winter 2022, UCSD)
Graduate-level course on social and ethical issues in data science Read more
COMM 106E: Data, Science, and Society (Spring 2022, UCSD)
Undergraduate course on social issues in data science Read more
COMM 106E: Data, Science, and Society (Fall 2022, UCSD)
Undergraduate course on social issues in data science Read more
COMM 106D: Data and Culture (Winter 2023, UCSD)
Intermediate undergraduate course on the relationship between data and culture Read more