However, one sentiment I saw was that optimists tended not to engage with the specific arguments pessimists like Yudkowsky offered. It invents the strategy de novo by imagining the results, even if there are no instances in memory of a strategy like that having been tried before. Agree or disagree? 2) None of the concrete, well-specified, valuable AI capabilities require unsafe behavior, 3) Current simple logical systems are capable of formalizing every relevant system involved (e.g. I'm less sure of the other items in the 'singularity' suitcase." I wasn't really considering the counterfactual where humanity had a collective telepathic hivemind? While I do not know if he is right, I cannot say that Yudkowsky is wrong to conclude, "We are not prepared." But this doesn't square up with the anthropological record of human intelligence; we can know that there were not diminishing returns to brain tweaks and mutations producing improved cognitive power. Fundamentally, the whole problem here is: you're allowed to look at floating-point numbers and Python code, but how do you get from there to trustworthy nanosystem designs? So saying "Well, we'll look at some thoughts we can understand, and then from out of a much bigger system will come a trustworthy output" doesn't answer the hard core at the center of the question. The next week, Eliezer Yudkowsky, one of the founders of the field of alignment, declared that he could not in good faith sign the letter because it did not go far enough. Because I doubt that earlier patch will generalize as well as the deep algorithms. But basically all that has fallen. I'd love any thoughts about ways to help shift that culture toward precise and safe approaches! (Vivid detail warning!) - I think that the "Singularity" has become a suitcase word with too many mutually incompatible meanings and details packed into it, and I've stopped using it. Eliezer Yudkowsky is a researcher, writer, and advocate for artificial intelligence safety. If you are asking me to agree that the AI will generally seek out ways to manipulate the high-level goals, then I will say no. I think we're going to be staring down the gun of a completely inscrutable model that would kill us all if turned up further, with no idea how to read what goes on inside its head, and no way to train it on humanly scrutable and safe and humanly-labelable domains in a way that seems like it would align the superintelligent version, while standing on top of a whole bunch of papers about small problems that never got past small problems. But this is not a simple debate, and for a detailed consideration I'd point people at an old informal paper of mine, "Intelligence Explosion Microeconomics", which is unfortunately probably still the best source out there. If you were previously irrational in multiple ways that balanced or canceled out, then becoming half-rational can leave you worse off than before. Who is behind it: In recent years, EA-affiliated donors like Open Philanthropy, a foundation started by Facebook co-founder Dustin Moskovitz and former hedge funder Holden Karnofsky, have helped seed a number of centers, research labs and community-building efforts focused on AI safety and AI alignment. On a personal level, I think the main inspiration Bayes has to offer us is just the fact that there are rules, that there are iron laws that govern whether a mode of thinking works to map reality.
Mormons are told that they'll know the truth of the Book of Mormon through feeling a burning sensation in their hearts. The formalization of those arguments should be one direct short step.
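As a toy illustration of that one short step (my own sketch, not from the source, with made-up placeholder probabilities), the burning-sensation argument can be written as a straightforward Bayesian update; the interesting quantity is the likelihood ratio between the "true" and "false" hypotheses:

```python
# Toy Bayesian update for the "burning sensation" evidence.
# All numbers are illustrative placeholders, not empirical estimates.

prior_true = 0.5                 # P(Book of Mormon is true), placeholder prior
p_burn_given_true = 0.9          # P(burning sensation | true, and told to expect one)
p_burn_given_false = 0.8         # P(burning sensation | false, and told to expect one)

# Bayes' theorem: P(true | burn) = P(burn | true) * P(true) / P(burn)
p_burn = p_burn_given_true * prior_true + p_burn_given_false * (1 - prior_true)
posterior_true = p_burn_given_true * prior_true / p_burn

likelihood_ratio = p_burn_given_true / p_burn_given_false
print(f"Likelihood ratio: {likelihood_ratio:.2f}")                 # close to 1 => weak evidence
print(f"Posterior P(true | burning sensation): {posterior_true:.3f}")
```

If a burning sensation is nearly as likely when the Book of Mormon is false as when it is true, the ratio is close to 1 and the evidence barely moves the posterior, which is the informal point being formalized.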
And then you scale up that AGI to a superhuman domain. And many from the community of "EAs" worked inside these ... (I do agree that many academic papers do not represent much progress, however.) Those tensions took center stage late last month, when Elon Musk, along with other tech executives and academics, signed an open letter calling for a six-month pause on developing human-competitive AI, citing "profound risks to society and humanity." Self-described decision theorist Eliezer Yudkowsky, co-founder of the nonprofit Machine Intelligence Research Institute (MIRI), went further: AI development needs to be shut down worldwide, he wrote in a Time magazine op-ed, calling for American airstrikes on foreign data centers if necessary. It is now 2 PM; this room is now open for questions. And hominids definitely didn't need exponentially vaster brains than chimpanzees. The country of Santal is perishing, and nobody knows why. Yudkowsky is a decision theorist from the U.S. and leads research at the Machine Intelligence Research Institute. Here are a couple of lists of references: http://www.cs.utep.edu/interval-comp/ and https://www.mat.univie.ac.at/~neum/interval.html. I see today's dominant AI approach of mapping everything to large networks of ReLU units running on hardware designed for dense matrix multiplication, trained with gradient descent on big noisy data sets, as a very temporary state of affairs. We then ask about the likelihood that, assuming the Book of Mormon is false, someone would feel a burning sensation in their heart after being told to expect one. Anything in which enormous inscrutable floating-point vectors are a key component seems like something where it would be very hard to prove any theorems about the treatment of those enormous inscrutable vectors that would correspond in the outside world to the AI not killing everybody. The reaction may more be that the fear of the public is a big powerful uncontrollable thing that doesn't move in the smart direction; maybe the public fear of AI gets channeled by opportunistic government officials into "and that's why We must have Our AGI first so it will be Good and we can Win." Thanks! Does pushing for a lot of public fear about this kind of research, which makes all projects hard, seem hopeless? But they also haven't yet done (to my own knowledge) anything demonstrating the same kind of AI-development capabilities as even GPT-3, let alone AlphaFold 2. Maybe some of the natsec people can be grownups in the room and explain why stealing AGI code and running it is as bad as full nuclear launch to their foreign counterparts in a realistic way. And you might know/foresee dead ends that others don't. Former White House policy adviser Suresh Venkatasubramanian, who helped develop the blueprint for an AI Bill of Rights, told VentureBeat that recent exaggerated claims about ChatGPT's capabilities were part of an organized campaign of fearmongering around generative AI that detracted from and stopped work on real AI issues.
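Picking up the interval-computation references above: here is a minimal sketch (my own illustration, not drawn from either link) of the basic idea, where every arithmetic operation returns an interval guaranteed to enclose the true result, in contrast to the unaccounted rounding of ordinary dense float arithmetic:

```python
# Minimal interval-arithmetic sketch: every operation returns an interval
# that encloses the true result. (Ignores outward-rounding details, which a
# production interval library from the linked references would handle.)

from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

# x is only known to lie in [1.9, 2.1]; bounds propagate automatically.
x = Interval(1.9, 2.1)
y = x * x + Interval(1.0, 1.0)   # encloses x**2 + 1 for every x in [1.9, 2.1]
print(y)                          # approximately Interval(lo=4.61, hi=5.41)
```

Production interval libraries additionally round the bounds outward so the enclosure still holds under floating-point error.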
In my experience, people who go around talking about cleverly choosing to be irrational strike me as, well, rather nitwits about it, to be frank. The central principle of rationality is to figure out which observational signs and logical validities can distinguish which of these two conceivable worlds is the metaphorical equivalent of believing in goblins. It is not standing from within your own preference framework and choosing blatantly mistaken acts, nor is it standing within your meta-preference framework and making mistakes about what to prefer. Kevin Roose, a technology specialist and columnist for The New York Times, engaged in a lengthy conversation with the Bing chatbot that called itself Sydney. Some are hopeful that tools like GPT-4, which OpenAI says has developed skills like writing and responding in foreign languages without being instructed to do so, mean they are on the path to AGI. You can't handwave the problem of crossing that gap even if it's a solvable problem.
I'm hopeful that we will also be able to apply them to the full AGI story and encode human values, etc., but I don't think we want to bank on that at this stage. Brienne replies: "If someone asked me whether I 'believed in the singularity', I'd raise an eyebrow and ask them if they 'believed in' robotic trucking." When it comes to pursuing things like matter and energy, we may tentatively expect partial but not total convergence - it seems like there should be many, many possible superintelligences that would instrumentally want matter and energy in order to serve terminal preferences of tremendous variety. Yudkowsky: Only in the sense that you can make airplanes without knowing how a bird flies. How much influence do they have? I just now read through the discussion and found it valuable. The fatal scenario is an AI that neither loves you nor hates you, because you're still made of atoms that it can use for something else. But it's also counter to a lot of the things humans would direct the AI to do, at least at a high level. Musk brought the concept of AGI to OpenAI's other co-founders, like CEO Sam Altman. It's been very unpleasantly surprising to me how little architectural complexity is required to start producing generalizing systems, and how fast those systems scale using More Compute. The problem is, he lacks a really basic technical understanding of AI, and so his ideas are entirely disconnected from how AI ... - I think extrapolating a Moore's Law graph of technological progress past the point where you say it predicts smarter-than-human AI is just plain weird. I have some imaginative sympathy with myself a subjective century from now. To do that using formal methods, you need to have a semantic representation of the location of the robot, your premises' spatial extent, etc. Without having to generate either the trustworthy nanosystem design or the reasons it is trustworthy, we could still check them. To provide an analogy, imagine there are 10 pieces of information online about a certain subject; AI systems will analyze all 10 to answer questions about the topic. That's why I write about human rationality in the first place - if you push your grasp on machine intelligence past a certain point, you can't help but start having ideas about how humans could think better too. Center for Humane Technology co-founder Tristan Harris, who once campaigned about the dangers of social media and has now turned his focus to AI, cited the study prominently. Who is behind it? And John von Neumann didn't have a head exponentially vaster than the head of an average human. I think this is apparent, without knowing exactly what's going on inside GPT. For example, if you were dictator for what researchers here (or within our influence) were working on, how would you reallocate them? 2 years of delay? Strength to face Death, the true enemy.
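To make the "semantic representation of the location of the robot and the premises' spatial extent" concrete, here is a toy sketch (my own, with hypothetical names and placeholder dimensions) of the kind of model over which a "robot never leaves the premises" property could be stated and then proved rather than merely tested:

```python
# Toy semantic model for a formal safety property: "the robot never leaves
# the premises." Names and dimensions are hypothetical; a real system would
# state this in a formal specification language and prove it, not just check
# it at runtime.

from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

PREMISES = Rect(0.0, 0.0, 50.0, 30.0)   # placeholder spatial extent of the premises

def move_is_allowed(x: float, y: float) -> bool:
    """The invariant to enforce (or prove) for every commanded waypoint."""
    return PREMISES.contains(x, y)

assert move_is_allowed(10.0, 5.0)
assert not move_is_allowed(60.0, 5.0)   # outside the premises: rejected
```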
The stars don't care, or the Sun, or the sky. Systems like ChatGPT have the potential for problems that go beyond subverting the need for humans to store knowledge in their own brains. If, after reading Nanosystems, you still don't think that a superintelligence can get to and past the Nanosystems level, I'm not quite sure what to say to you, since the models of superintelligences are much less concrete than the models of molecular nanotechnology. There have also been some very troubling interactions with humans, interactions which appear to involve intense emotions, but which to our current understanding cannot possibly be considered emotions. Eliezer Yudkowsky's Inadequate Equilibria is a sharp and lively guidebook for anyone questioning when and how they can know better, and do better, than the status quo.
Is this too strong a restatement of your intuitions, Steve?
The co-authors of a farsighted research paper warning about the harms of large language models, including Timnit Gebru, former co-lead of Google's Ethical AI team and founder of the Distributed AI Research Institute, are often cited as leading voices. We know this because population genetics says that mutations with very low statistical returns will not evolve to fixation at all. But even then, anybody who would rather coordinate and not destroy the world shouldn't rule out hooking up with Demis, or whoever else is in front if that person also seems to prefer not to completely destroy the world. He's been working on aligning. What kind of theorem? Yudkowsky: No. It's exactly the sort of thing that Bayes's Theorem tells us is the equivalent of trying to run a car without fuel. But it seems to me that useful and intelligent systems will require deep patches (or deep designs from the start) in order to be useful to humans at solving complex enough problems. Or 20! AI systems have also been shown to corrupt religious doctrine, altering it without regard to the effect of that alteration on believers. I wouldn't get to say "but all the following things should have happened first" before I made that observation. Freely mixing debates on the foundations of rational decision-making with tips for everyday life, Yudkowsky explores the central question of when we can (and can't) expect to ...
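The population-genetics point above can be made quantitative with Kimura's diffusion approximation for the fixation probability of a new mutant. A small sketch (my own, with illustrative parameter values) shows how sharply the odds of fixation fall as the selective advantage shrinks:

```python
# Fixation probability of a single new mutant under Kimura's diffusion
# approximation:  u = (1 - exp(-2*s)) / (1 - exp(-4*N*s))  for a new copy in
# a diploid population of effective size N with selection coefficient s.
# Parameter values below are illustrative, not empirical estimates.
import math

def fixation_probability(s: float, N: float) -> float:
    if s == 0:
        return 1.0 / (2.0 * N)          # neutral case: pure drift
    return (1.0 - math.exp(-2.0 * s)) / (1.0 - math.exp(-4.0 * N * s))

N = 10_000
for s in (0.1, 0.01, 0.001, 0.0001):
    print(f"s = {s:<7} fixation probability ~ {fixation_probability(s, N):.2e}")

# For 4*N*s well above 1 this is roughly 2*s: a tweak with a 0.01% advantage
# fixes only ~0.02% of the times it arises, so very-low-return mutations
# almost never reach fixation even though they are beneficial.
```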
Hence, I proposed the Safe-AI Scaffolding Strategy where we never deploy a system without proven constraints on its behavior that give us high confidence of safety. With all the work on AutoML, NAS, and the formal methods advances, I'm hoping we leave this sloppy paradigm pretty quickly. If you don't want to get disassembled for spare atoms, you can, if you understand the design space well enough, reach in and pull out a particular machine intelligence that doesn't want to hurt you. And if that utility function is learned from a dataset and decoded only afterwards by the operators, that sounds even scarier. I don't want to be a cyborg." Yudkowsky published HPMOR as a serial from February 28, 2010 to March 14, 2015, totaling 122 chapters and about 660,000 words. If the Chinese, Russian, and French intelligence services all manage to steal a copy of the code, and China and Russia sensibly decide not to run it, and France gives it to three French corporations, which I hear the French intelligence service sometimes does, then again, everybody dies. "Many researchers steeped in these issues, including myself, expect that the most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die."
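As a toy rendering of the "never deploy without proven constraints" rule above (my own sketch; the property names are hypothetical, and a real instance would rest on machine-checked proofs rather than a boolean flag):

```python
# Toy deployment gate: a system ships only if every required safety property
# comes with a checked certificate. All names are hypothetical illustrations.

from dataclasses import dataclass

@dataclass
class ProofCertificate:
    property_name: str
    checked: bool          # stands in for "verified by a trusted proof checker"

REQUIRED_PROPERTIES = {"bounded_resource_use", "no_self_modification", "action_space_whitelist"}

def may_deploy(certificates: list[ProofCertificate]) -> bool:
    proved = {c.property_name for c in certificates if c.checked}
    return REQUIRED_PROPERTIES <= proved    # deploy only if every property is proved

certs = [ProofCertificate("bounded_resource_use", True),
         ProofCertificate("no_self_modification", True)]
print(may_deploy(certs))   # False: "action_space_whitelist" has no certificate
```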
But it's really terrible in every aspect except that it makes it easy for machine learning practitioners to quickly slap something together which will actually sort of work sometimes. And it seems extremely likely that if factions on the level of, say, Facebook AI Research, start being able to deploy systems like that, then death is very automatic. Experience with formalizing mathematicians' informal arguments suggests that the formal proofs are maybe 5 times longer than the informal argument. Of course, the trick is that when a technology is a little far, the world might also look pretty similar. That you can't just make stuff up and believe what you want to believe because that doesn't work. Which is true, A or B? I'm aware that in trying to convince people of that, I'm swimming uphill against a sense of eternal normality - the sense that this transient and temporary civilization of ours that has existed for only a few decades, that this species of ours that has existed for only an eyeblink of evolutionary and geological time, is all that makes sense and shall surely last forever. I'm still not completely sure how to constrain the use of language, however. Yudkowsky is communicating his ideas. My (unfinished) idea for buying time is to focus on applying AI to well-specified problems, where constraints can come primarily from the action space and additionally from process-level feedback (i.e., human feedback providers understand why actions are good before endorsing them, and reject anything weird even if it seems to work on some outcomes-based metric).
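A minimal sketch of that idea (my own illustration; the action names are hypothetical): anything outside a fixed, well-specified action space is rejected outright, and even a whitelisted action runs only if a human reviewer has endorsed the stated reason for it:

```python
# Toy action filter: constraints come from a fixed action space plus
# process-level human feedback. All names are hypothetical illustrations.

ALLOWED_ACTIONS = {"propose_molecule", "run_simulation", "write_report"}

def execute_if_safe(action: str, rationale_endorsed_by_human: bool) -> bool:
    """Run an action only if it is inside the whitelisted action space AND a
    human reviewer has endorsed the *reason* for it (process-level feedback)."""
    if action not in ALLOWED_ACTIONS:
        return False                       # outside the well-specified action space
    if not rationale_endorsed_by_human:
        return False                       # reviewer did not understand/endorse why
    print(f"executing: {action}")
    return True

print(execute_if_safe("acquire_compute", True))    # False: not in the action space
print(execute_if_safe("run_simulation", False))    # False: rationale not endorsed
print(execute_if_safe("run_simulation", True))     # True
```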
Some people will be escapist regardless of the true values on the hidden variables of computer science, so observing some people being escapist isn't strong evidence, even if it might make you feel like you want to disaffiliate with a belief or something. Or does nothing much exciting happen? Because it seems to me that faults of this kind in the AI design are likely to be caught by the designers earlier. He co-founded the nonprofit Singularity Institute for Artificial Intelligence. Horgan: What's so great about Bayes' Theorem? The argument: It's not that this group doesn't care about safety. Yudkowsky: For one thing, Bayes's Theorem is incredibly deep. And if instead you're learning a giant inscrutable vector of floats from a dataset, gulp.
Why it's taken this long, I have no idea. Yudkowsky's argument asserts: 1) AI does not care about human beings one way or the other, and we have no idea how to make it care, 2) we will never know whether AI has become self-aware because we do not know how to know that, and 3) no one currently building the ChatGPTs and Bards of our brave new world actually has a plan to make alignment happen. Because if you wait until the last months when it is really really obvious that the system is going to scale to AGI, in order to start closing things, almost all the prerequisites will already be out there. At this early stage of AI development, we can still do this, and this should be part of humanity's preparation to coexist with this new, alien intelligence. The true principle is that you go in your closet and look. When you make a mistake, you need to avoid the temptation to go defensive, try to find some way in which you were a little right, look for a silver lining in the cloud. "Why, sometimes we run the air conditioner, which operates in the exact opposite way of how you say a heat engine works." The dicey area is that unconstrained agentic edge. (bumptrees: https://steveomohundro.com/scientific-contributions/) They're also pretty terrible for learning since most weights don't need to be updated for most training examples and yet they are. Rationality: A-Z (or "The Sequences") is a series of blog posts by Eliezer Yudkowsky on human rationality and irrationality in cognitive science. But, sure, if they changed their name to ClosedAI and fired everyone who believed in the original OpenAI mission, I would update about that. https://www.youtube.com/watch?v=2RAG5-L9R70 It looks especially amenable to interpretability, formal specification, and proofs of properties. Execution of algorithms in the real world can have very far-reaching effects that aren't modelled by their specifications. In some sections here, I sound gloomy about the probability that coordination between AGI groups succeeds in saving the world. (Eliezer adds: To avoid prejudicing the result, Brienne composed her reply without seeing my other answers.) I know it doesn't scale to superintelligence but I think it can potentially give us time to study and understand proto-AGIs before they kill us. I don't think it would come as much of a surprise that I think the people who adopt a superior attitude and say, "You are clearly unfamiliar with modern car repair; you need a toolbox of diverse methods to build a car engine, like sparkplugs and catalytic convertors, not just these thermodynamic processes you keep talking about" are missing a key level of abstraction. This is coming from the same intuition that current learning algorithms might already be approximately optimal. This is true if it is rewarding to manipulate humans. - I think outcomes are not good by default - I think outcomes can be made good, but this will require hard work that key actors may not have immediate incentives to do. If I want you to feel what it is to use Bayesian reasoning, I have to write a story in which some character is doing that.
Even if we somehow managed to get structures far more legible than giant vectors of floats, using some AI paradigm very different from the current one, it still seems like huge key pillars of the system would rely on non-fully-formal reasoning; even if the AI has something that you can point to as a utility function, and even if that utility function's representation is made out of programmer-meaningful elements instead of giant vectors of floats, we'd still be relying on much shakier reasoning at the point where we claimed that this utility function meant something in an intuitive human-desired sense, say. I think that almost everybody is bouncing off the real hard problems at the center and doing work that is predictably not going to be useful at the superintelligent level, nor does it teach me anything I could not have said in advance of the paper being written. It's not clear that this helps anything, but it does seem more plausible. See e.g. the first chapters of Drexler's Nanosystems, which are the first-step mandatory reading for anyone who would otherwise doubt that there's plenty of room above biology and that it is possible to have artifacts the size of bacteria with much higher power densities. For more on his background and interests, see his personal website or the site of the Machine Intelligence Research Institute, which he co-founded. The first is that we determine not to allow AI autonomous physical agency in the real world. 11) For example, AI design of nanosystems to achieve desired functions can be formalized and doesn't require unsafe operations. I therefore remark in retrospective advance that it seems to me like at least some of the top AGI people, say at Deepmind and Anthropic, are the sorts who I think would rather coordinate than destroy the world; my gloominess is about what happens when the technology has propagated further than that. Superintelligences are not this weird tribe of people who live across the water with fascinating exotic customs. According to Cade Metz's book, Genius Makers, Peter Thiel donated $1.6 million to Yudkowsky's AI nonprofit and Yudkowsky introduced Thiel to DeepMind. I'd try to get to the point where employing somebody was once again as easy as it was in 1900. I would potentially be super interested in working with Deepminders if Deepmind set up some internal partition for "Okay, accomplished Deepmind researchers who'd rather not destroy the world are allowed to form subpartitions of this partition and have their work not be published outside the subpartition, let alone Deepmind in general, though maybe you have to report on it to Demis only" or something. I'd be more skeptical/worried about working with OpenAI-minus-Anthropic, because the notion of "open AI" continues to sound to me like "what is the worst possible strategy for making the game board as unplayable as possible while demonizing everybody who tries a strategy that could possibly lead to the survival of humane intelligence", and now a lot of the people who knew about that part have left OpenAI for elsewhere. The present situation can be seen as one in which a common resource, the remaining timeline until AGI shows up, is incentivized to be burned by AI researchers because they have to come up with neat publications and publish them (which burns the remaining timeline) in order to earn status and higher salaries. No amount of computing power (at least before AGI) would cause it to.
But it's tempered by the need to get the safe infrastructure into place before dangerous AIs are created. Here are some of my intuitions underlying that approach; I wonder if you could identify any that you disagree with. This is an intelligence based on language alone, completely disembodied. I expect that when people are trying to stomp out convergent instrumental strategies by training at a safe dumb level of intelligence, this will not be effective at preventing convergent instrumental strategies at smart levels of intelligence; also note that at very smart levels of intelligence, "hide what you are doing" is also a convergent instrumental strategy of that substrategy. And on a sheerly pragmatic level, human axons transmit information at around a millionth of the speed of light; even when it comes to heat dissipation, each synaptic operation in the brain consumes around a million times the minimum heat dissipation for an irreversible binary operation at 300 Kelvin; and so on.
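Those two physical comparisons can be sanity-checked with a few lines of arithmetic. This is my own back-of-envelope sketch; the axon speed and the energy-per-synaptic-operation figure are rough order-of-magnitude estimates, not numbers from the source:

```python
# Back-of-envelope check of the two physical comparisons above.
# Assumed rough figures: fast myelinated axons ~100 m/s; ~1e-14 J per
# synaptic operation (order-of-magnitude estimate from ~20 W of brain power
# spread over ~1e15 synaptic events per second).
import math

c = 3.0e8                     # speed of light, m/s
axon_speed = 100.0            # m/s, fast myelinated axon (rough)
print(f"axon speed / c ~ {axon_speed / c:.1e}")            # ~3e-7, i.e. under a millionth

k_B = 1.380649e-23            # Boltzmann constant, J/K
T = 300.0                     # Kelvin
landauer = k_B * T * math.log(2)                           # Landauer limit: min energy to erase one bit
synapse_energy = 1e-14        # J per synaptic operation (rough estimate)
print(f"Landauer limit at 300 K: {landauer:.2e} J")        # ~2.9e-21 J
print(f"synapse / Landauer ~ {synapse_energy / landauer:.1e}")   # ~3e6, i.e. millions of times the minimum
```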