When I got home from surf on the morning of Friday January 14th, Denny said to take a look. Take a look at what?
A website I built with a friend decided to take flight. As the usage climbed hour after hour, the one thing I wanted to know most was what are users doing? I put my head down and built a live feed of the user submissions and responses.
As the requests flooded the feed, the behavior was clear. They use it for the same thing I had been using the tool for, for nearly two years. I had been using this tool in silence knowing that I needed it, wondering did anyone out there need it too?
I then read through some Twitter commentary. The users are saying exactly the same things I had been saying to myself for a year. I left my computer and started to think.
Where did this come from? How did we get here? Where are we? Where can we go from here? Should we go? And to what end?
In this post, I intend to capture my feelings around this moment and clarify for myself the answers to the above questions.
Where did this come from?
In the process of working on our new startup, we created a public space for anyone on the web to find us and interact with me and Denny. Yash showed up, eager to help. We didn’t have much for him to do because while we knew where we wanted to go, we lacked strong conviction around what to build to get us to that destination.
I like Yash, so while we couldn’t offer him work I suggested we meet weekly to discuss essays. I suspected that through discussion, we could each learn something new. For me, it was a way to re-learn things I had forgotten and, by giving advice on what made my startup work, a weekly reminder to follow what I preach.
Around July 2020, GPT-3 went into private beta. We toyed with it. In the playground, OpenAI offered users a preset: input some text and GPT-3 would summarize it for a 2nd grader. I started putting scientific abstracts into the playground.
That turned into summarizing abstracts every day in the playground and sharing the results publicly and with friends. It felt light. It felt fun. Then I started summarizing COVID-19 pre-print abstracts. That felt scary.
By pushing the thought of democratizing this summarization tool to its potential conclusion, I started to see clearly the risks of mechanizing the labor of processing and understanding language. The one who does the labor does the learning. When the labor is outsourced, humans are left with an input and an output with no understanding of how input generated output. When you leave out the understanding of why A turned to B, give up that agency to machines, suddenly the world gets dark.
On October 28th, 2020 I wrote:
I worry a lot about GPT-3 being used for science summarization with no human supervision, but still have a strong desire to try and understand the bad and the good that can come from GPT-3. Very curious how to take bad uses and make them safe.
Now reflecting on this more than a year later, I still agree with the above statement.
To answer the question of where did this come from? Tldrpapers.com came from my desire to work with Yash, my personal lived experience as a scientist trying desperately to understand the scientific literature, and a curiosity to discover the evils of AI and, if evils were discovered, to imagine how to fight the bad with good.
See a fire? Grab a friend and run towards it. The more dangerous the more interesting. If not us, then who?
How did we get here?
How did I know the idea of summarizing abstracts for a 2nd grader was worthwhile? It solves my own problem.
The feelings of reading primary literature are for me: annoying, frustrating, tedious. But we do it because there is content in that paper that we want to know that we cannot get elsewhere. So we suffer.
For a decade I ran a startup that persuaded everyday people to donate money to scientific experiments. In that role I reviewed scientific proposals weekly, probably over 10,000 in my tenure. For ten years, I reviewed scientific jargon for a living in a professional setting. My job was to digest what the scientific team had written into language that the scientists' friends and family could understand. Our business made a percentage fee off of every dollar donated.
The better I was at translating scientific jargon into something a second grader could understand, the more likely an audience would get the gist of the proposal and give money. If we got that funnel right, the business made revenue. Revenue is needed to employ the labor required to keep the service running.
Beyond doing the labor myself, I was also responsible for training my colleagues to review and advise scientists in translating their ideas into easy to understand language, without simplifying the science and staying completely honest.
One of the first things I did with the GPT-3 playground was input abstracts that I had reviewed to see what it would output. The results were not wrong, but not entirely accurate. I started to use the results as a generative tool.
Instead of using my labor to write the first draft of the summary myself, I used GPT-3 to generate a first draft for me to refactor into accurate language.
I knew that democratizing access to this tool would produce accurate sounding, inaccurate language. That worried me. Knowing this I still had a desire to see how dark the timeline could get, to collect first hand data on how scientists would respond. The purpose of knowing this dark timeline is to start to understand what tooling is needed to lift the scientific society out of that darkness, to determine what type of governance would be required for the good of the technology to outweigh the bad.
I saw GPT-3 no differently from a tool such as a hammer or pencil. GPT-3, like all scientific discoveries, has no agenda. What matters after a discovery is discovered is how we as a community of mortals use the tool to our advantage or use the tool to our destruction.
I encouraged Yash and myself to continue with the project because the utility solved a problem I know intimately and I knew that if I didn’t try to understand the problem, the problem would eventually catch up to me. So, I continued because I wanted to see how bad this good thing could get and in the badness see how much control we have over its strength.
Where are we?
When Denny said to take a look, he meant at the responses to a tweet that started to get retweeted. Seemingly out of nowhere, that piece of content had spurred scientists all over Twitter to submit abstracts, screenshot the GPT-3 summarization, and share it with their followers.
At the time of me taking a look, OpenAI showed 6,488 requests to GPT-3. I added a bot to send every request and GPT-3 response to a Slack channel. Once that was set up, the feed was non-stop.
My inclination was to do nothing and to hold tight and let it ride out. That’s what I did.
Later that day I messaged Yash:
I guess we made something people want afterall. ¯\_(ツ)_/¯
Then on Saturday morning Yash reported that we’re at 50k users and 1,000 new followers on Twitter. He suggested we spend some time to think about what we want to do.
We talked about it. We discussed our personal goals for the project, what we could work on if we did work on it. I am pretty ambivalent about the project. I feel that the project served its purpose: it brought Yash and me together to make something useful. Then the tool continued to bring a smile to my face every time I used it. That’s all I ever expected of it.
Then Sunday morning I got a text from Yash.
I need to turn off tldr papers for a bit until the openAI safety team can take a closer look at it.
So then Yash turned it off. I left at 5:55AM to surf. That’s that.
Where can we go from here?
It is clear that the tool can bring users joy. The summaries are refreshing. They are mysterious. They are surprising. When not taken seriously, the summaries instill a much desired levity into society.
Tldr papers is an art piece. Art illuminates the truth. The expressions from the humans that interact with the art communicate a desire that we all know to be true, but for some reason no one talks about. No one talks about this desire because it is so pervasive that it is nonobvious how we can address it. When there is no hope for how to change something we know is bad, it is difficult to try. When something seems impossible, only the ignorant have the will to try. The establishment will laugh at them. And those who understand the problem but have no faith that we can change have their hands tied and can only muster the energy and courage to joke about the situation.
When the people that can't do it tell the people who are doing it it can't be done, we get the outcome we see today. It is an uncomfortable position. It forces the pessimist to see what they are seeing.
Let the art do the art. Let the software interact with society without the interference of its creators.
When viewing Tldr papers as a piece of art, I am proud of what we made. We introduced a toy that got scientists to stop and think. We introduced a toy that helped people feel something. The feelings, though not always positive, gave live humans life. It creates a partnership between machine and human that is inevitable in our destination.
Summarization of scientific abstracts for 2nd graders brings character to a corner of the web. That I like. I don’t think we need to change a thing. Let the tool live and breathe. Let those that come after us remix it, make it better in their own ways.
Let the truth show through the art. Scientific communication is elitist. Summarization of scientific communication is patronizing. Scientific society uses fancy words to keep the people out. When we as a society decide to stop unconsciously perpetuating the evils of scientific jargon and consciously communicate using language that is simple and accurate, we give power to the people. We give people the power to participate in the scientific conversation and return agency to individuals.
We need more toys to show us how backwards our acceptance of the status quo is. The power of the people is greater than that of the people in power. And it is by giving people access to tooling that the people take back their power. When people regain the feeling of control they organize and make the world a better place.
We aren’t going to get where we want to go waiting for our leaders to take action. We get where we need to go when everyone feels the ability to take action.
Should we go?
Someone should go, but not me. Not right now.
Should we keep working on Tldr papers? I don’t particularly want to. I want Tldr papers to exist because I use it and I want other people to experience first hand what I experience, if only for them to question the future and re-evaluate how they want to contribute to a better or worse future. I want to work with Yash and many of my other friends, but I don’t feel particularly drawn to the problems of understanding the darkness of artificial general intelligence used for scientific summarization.
Someone should go. I want to support those that go, by using what they build, by encouraging an evolving governance for humanity to constantly evaluate if we should. I remain worried about how this will be used. That worry translates for me into an urgency to understand this problem and the need for experiments to see first hand the unexpected side effects.
I am not that someone, but those that are going please keep going, and those that want to go, go.
There is something more interesting that occupies my mind and I know for myself two obsessions is one too many.
And to what end?
Does GPT-3 return language that is wrong? Yes. Mean? Yes. Should that stop us from understanding the utility of AI? No. Should that stop us from democratizing access to tools like Tldr papers? No. If we limit the access of Tldr papers to the elite, we create a situation where the elite continues to learn and gain knowledge that the masses cannot. This creates a wider and wider gap in the information asymmetry between the haves and the have-nots.
The right thing to do here is to democratize access and monitor usage, to allocate resources to the teams that want to dedicate their life’s work to understanding how to use AI for good. To periodically re-evaluate what it means to do good. Let the people participate and outline a set of values and a purpose that draws the following of the masses united for a better future, a future that serves the many not the few.
Because you can does not mean you should. And if you have, are, or will, then unite the craftsmen to work together towards a common good.
January 16, 2022