I am a French New Yorker and a CS student at Columbia. Among many other things, I like to build stuff, learn new programming languages, travel, and rock climb. I also love to read, on topics ranging from econ to history and biology.
My research is in distributed systems. I currently focus on leveraging statistics and machine learning to improve data management and transparency. I am also interested in the impact of the sharing economy.
Data has become the principal asset of the Internet era. While this data offers
unique opportunities to improve personal and business effectiveness, it also
poses serious risks to users' privacy, and to organizations, by exposing
extensive data stores to external and internal attacks.
In my research, I build tools and design mechanisms that leverage statistics and
machine learning to (1) increase the current Web's transparency by revealing
how personal data is being used,
and (2) enable a more rigorous and selective approach to big data collection, access,
and protection, so as to reap its benefits without imposing undue risks.
Data Use Transparency Infrastructure
To add transparency to data uses on the Web, I am building a series of
scalable, generic, and reliable tools to detect data flows within and
across web services. My initial system, XRay, provides a first
design and theoretical building blocks to detect the use of digital
personal data for targeting and personalization. The key insight in XRay
is to infer targeting by correlating user inputs (such as searches,
emails, or locations) to service outputs (such as ads, recommendations,
or prices) based on observations obtained from user profiles populated
with different subsets of the inputs. My latest tool, Sunlight,
leverages rigorous statistical methods to determine the causes of online
targeting at large scale and with statistical confidence.
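As a toy illustration of the correlation idea described above, the sketch below populates hypothetical profiles with random subsets of inputs and scores each input by how much more often an output appears in profiles that contain it. This is not the XRay or Sunlight implementation: the function names, the 0.5 inclusion probability, and the simple difference-of-rates score are all assumptions made for the example.

```python
import random


def assign_inputs(inputs, n_profiles, p=0.5, seed=0):
    """Populate each hypothetical profile with a random subset of the inputs.

    Each input is included independently with probability p (an assumption
    of this sketch, not a documented XRay parameter).
    """
    rng = random.Random(seed)
    return [{x for x in inputs if rng.random() < p} for _ in range(n_profiles)]


def score_inputs(profiles, saw_output):
    """Score each input by a simple difference of observation rates.

    profiles:   list of sets, the inputs placed in each profile
    saw_output: list of bools, whether each profile observed the output
    Returns a dict mapping input -> (rate with input) - (rate without input).
    Inputs present in all profiles or in none are skipped, since there is
    nothing to compare them against.
    """
    all_inputs = set().union(*profiles)
    scores = {}
    for x in all_inputs:
        with_x = [seen for prof, seen in zip(profiles, saw_output) if x in prof]
        without_x = [seen for prof, seen in zip(profiles, saw_output) if x not in prof]
        if not with_x or not without_x:
            continue
        scores[x] = sum(with_x) / len(with_x) - sum(without_x) / len(without_x)
    return scores


if __name__ == "__main__":
    # Hand-built example: the output (say, an ad) appears exactly in the
    # profiles that contain input "a".
    profiles = [{"a", "b"}, {"a"}, {"b"}, {"c"}, {"a", "c"}, {"b", "c"}]
    saw_output = [True, True, False, False, True, False]
    scores = score_inputs(profiles, saw_output)
    print(max(scores, key=scores.get))  # prints "a"
```

A high score only suggests an association. Deciding whether it is statistically justified, especially when testing many inputs and outputs at once, requires the kind of rigorous statistical treatment that Sunlight is built around.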
Mathias Lecuyer, Riley Spahn, Giannis Spiliopoulos, Augustin Chaintreau, Roxana Geambasu, and Daniel Hsu. "Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence." (CCS'15) [PDF][Website][mention in The Economist]
Nicolas Viennot, Mathias Lecuyer, Jonathan Bell, Roxana Geambasu, and Jason Nieh. "Synapse: New Data Integration Abstractions for Agile Web Application Development." (EuroSys'15) [PDF][Website]
Mathias Lecuyer, Guillaume Ducoffe, Francis Lan, Andrei Papancea, Theofilos Petsios, Riley Spahn, Augustin Chaintreau, and Roxana Geambasu. "XRay: Increasing the Web's Transparency with Differential Correlation." (USENIX Security'14) [PDF][Website][NYT Bits article]