


  • Programmers, the senior ones who can court good money with relative ease at least, are gonna tend to be pretty well off, which I’m sure is part of it. For them, the concept of “skills gud, pay gud too, something something meritocracy vibes” pretty much applies (even if the reasons it works for them are probably not what they think), and afaik they don’t even have to fight for it with unions much of the time because the demand is high enough and the number of people at their skill level low enough. Entry level seems to be a much different story, having become saturated with all the bootcamp stuff and “learn to code” rhetoric. But there’s also stuff like legacy systems running on some old programming language that virtually nobody learns or actively uses anymore, where knowing it can give you a lot of leverage.

    The moment these types of people were faced with hardship in employment and wages, I’m confident many of them would start questioning a lot of things they never thought much about before. But as long as they are a relatively comfy class in high demand, much of the class struggle can fly under the radar for them and through that, much of the rhetoric that might persuade them to think about imperialism as well.




  • Meta trained the Llama 3.1 405B model on 16,000 H100s: https://ai.meta.com/blog/meta-llama-3-1/

    So yeah, the scale of it can get wild. But from what I can tell, there are clearly diminishing returns from throwing more processing power at model training, and more breakthroughs in the architecture are needed to get meaningfully further on general model “competence.” The main problem seems to be that you need a ridiculous amount of decent data to make scaling up worth it; not only in terms of the model actually showing signs of being “better”, but in terms of the cost to run inference when a given user actually uses the model. Quantization can reduce that cost somewhat, but in exchange for reducing overall model competence.
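
    To put rough numbers on the inference-cost side, here’s a back-of-envelope sketch (weights only, in Python; exact figures vary by deployment and this ignores activations, KV cache, etc.):

    ```python
    # Rough weight-memory math for a 405B-parameter model at different precisions.
    # Ballpark figures for the weights alone; activations, KV cache, etc. add more on top.
    params = 405e9

    for name, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
        gib = params * bytes_per_param / 1024**3
        print(f"{name}: ~{gib:,.0f} GiB just to hold the weights")
    ```

    That works out to roughly 750 GiB at fp16 versus under 200 GiB at 4-bit, which is why quantization is attractive even with the hit to competence.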

    Right now, my general impression is that the heavy hitter companies are still trying to figure out where the boundaries of the transformer architecture are. But I’m skeptical they can push it much further than it has gone through brute forcing scale. I think LLMs are going to need a breakthrough along the lines of “learning more from less” to make substantial strides beyond where they’re at now.


  • Yes and no. It’s not a solved problem so much as a worked-around problem. Diffusion models struggle with parts that are especially small and would normally have to be rendered with precision to look right. Some tooling does better on this by increasing the resolution (so that otherwise-small parts come out bigger) and/or by tuning the model so that it’s stiffer in what it can do, but less likely to produce the worst renders.

    In other words, fine detail is still a problem in diffusion models. Hands are related to it some of the time, but they’re not the entirety of it. Hands were kind of a symptom of the fine-detail problem, and making hands better hasn’t fixed that underlying problem (at least not entirely, and fixing it entirely might not be possible within the diffusion architecture). So it’s sorta like they’ve treated the symptom rather than the cause.
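
    To put toy numbers on the resolution point above (purely illustrative, not tied to any particular model), say a hand spans about 5% of the image width:

    ```python
    # How many pixels a "small" part gets at different render resolutions.
    # The 5% figure is made up purely for illustration.
    hand_fraction = 0.05

    for resolution in (512, 768, 1024, 2048):
        pixels = resolution * hand_fraction
        print(f"{resolution}px wide: the hand gets ~{pixels:.0f} pixels across")
    ```

    More pixels for the same small part gives the model more room to get the detail right, which is basically what the upscaling workarounds are buying.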


  • This growing secrecy poses challenges for the U.S. in determining whether it or China possesses faster supercomputers, a question deemed crucial for national security. Supercomputers play a pivotal role in the U.S.-China technological rivalry, as the nation with superior machines gains an edge in developing advanced military technology, including nuclear weapons. Jimmy Goodrich, a senior adviser at the Rand Corporation, told WSJ that even a slight supercomputing advantage can significantly impact military capabilities.

    More psychological projection from the barbarian burger empire.


  • The best LLMs at this point are pretty wild in how convincingly they can talk like a human and handle nuance, as well as a certain degree of creative output. But you’re not wrong that they’re effectively just very mathy prediction engines. They can’t plan ahead because all they’re doing is predicting the next token. They don’t really know anything in the way that a human does or a database records something; they just have pre-established “confidence” in something from their training data, which might be applied broadly or might be highly contextual to circumstance, like being able to get a height difference correct in a straightforward Q&A but failing to get it correct within the context of a fictional story.
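
    For anyone who hasn’t seen it spelled out, the generation loop is roughly this shape (a toy sketch where next_token_distribution is a made-up stand-in for a real model’s forward pass, not any actual API):

    ```python
    import random

    def next_token_distribution(context):
        # Hypothetical stand-in: given the context so far, return probabilities for the next token.
        if context[-1] == "the":
            return {"cat": 0.6, "dog": 0.3, "<end>": 0.1}
        return {"the": 0.7, "<end>": 0.3}

    def generate(context, max_tokens=10):
        for _ in range(max_tokens):
            dist = next_token_distribution(context)
            tokens, probs = zip(*dist.items())
            token = random.choices(tokens, weights=probs)[0]  # sample one token
            if token == "<end>":
                break
            context.append(token)  # the pick is final: no backtracking, no editing earlier tokens
        return context

    print(" ".join(generate(["the"])))
    ```

    Everything interesting lives inside the real model’s version of that distribution function; the loop itself never looks ahead or revises what it has already emitted.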

    There is also a fundamental problem, I think, in evaluating a LLM AI’s “intelligence” in particular. The primary way with which to “talk to” it is to use human language. But human language was created by humans for humans to talk about human experience and the way humans perceive the physical tactile world around them, as well as the inner worlds we have in our minds. Supposing some form of “intelligence” emerged from an LLM’s “neural networks” - how would we distinguish it as such? Its “experience” would have basically no similarities with ours and its only tool to communicate such would be through the language we use for our experiences. So it would likely just come out looking like an AI imitating a human, which naturally LLMs tend to be good at, because they need to be for the whole “language model” task to work effectively. And the more it deviated from that, the more it would likely be taken as “poor training” and in need of fixing. Add to this the fact that it can’t plan ahead, so even if there was some kind of experiencing in there, it’d have no way to string together something with intention to communicate such in the first place.

    In other words, with LLMs, people are creating something which is more or less rated on its capability to communicate at the same level of effectiveness as a grown literate human, but which has virtually none of the basis that a human has. So it seems more likely LLMs will (and may already be) reaching a point of philosophical zombie stuff more so than developing unique “intelligence.”


  • I have some experience with modding and game making - not paid company work type of stuff, but studied it in college, have made (small) games on my own or with others and have done extensive modding in one game that got a fair bit of attention.

    I agree with cf that it depends somewhat on the game. But overall, modding is likely going to be easier for a couple of reasons:

    1. Scope. Modding forces you to work within heavy constraints due to being unable to directly edit the game engine, source code, etc. For creative control, this is a drawback, but when you’re just trying to get something, anything made, it’s a help. It means what could be an overwhelming pool of possibility and a vision gone out of control becomes more akin to, “Okay, let’s see if I can change the color of this house.” In other words, it forces you to approach tasks as a smaller set of steps and in so doing, makes it easier to make some kind of progress at all, rather than none.

    2. Framework. Modding a game means there’s already an existing framework there, a game that functions relatively well, presumably has a decent gameplay loop, etc. So you don’t have to worry about, “Am I making something that will be utterly boring/unappealing/etc.” because there’s still the underlying game beneath it. So it’s a lot harder to spend time on something that isn’t enjoyable at all. And it means you have existing game design to mimic. In the game I heavily modded, some of the stuff I did was effectively repurposing features that were already there to use them slightly differently. I was still being creative and doing my own ideas, but much of the actual work of it was already done.

    Does this mean modding will always be easier than making your own game? Not necessarily. For example, you could make a simple console-based (as in command prompt, not game console) grid game in C++ that uses ASCII characters to show where things are, with a player moving from a starting point to a goal - something I’ve done before (roughly like the sketch below). But will this fulfill your desire to enact a creative vision? Probably not. The more you have to learn to get started, the harder it’s going to be to get to the creative part, and that seems to be the part people usually crave as an entry point.
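
    Just to make that concrete, here’s roughly what I mean by a console grid game (sketched in Python purely to keep it short; the C++ version is the same idea with a bit more boilerplate):

    ```python
    # Minimal console grid game: move the player '@' to the goal 'X' with w/a/s/d input.
    WIDTH, HEIGHT = 8, 5
    player = [0, 0]                     # row, col
    goal = [HEIGHT - 1, WIDTH - 1]
    moves = {"w": (-1, 0), "s": (1, 0), "a": (0, -1), "d": (0, 1)}

    while player != goal:
        for r in range(HEIGHT):
            print("".join(
                "@" if [r, c] == player else "X" if [r, c] == goal else "."
                for c in range(WIDTH)
            ))
        key = input("move (w/a/s/d): ").strip().lower()
        if key in moves:
            dr, dc = moves[key]
            player[0] = min(max(player[0] + dr, 0), HEIGHT - 1)  # clamp to the grid
            player[1] = min(max(player[1] + dc, 0), WIDTH - 1)

    print("You reached the goal!")
    ```

    Even something that small already puts you through input handling, state, and a game loop before anything “creative” shows up, which is the learning-curve point above.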

    Hope that makes sense!


  • That’s an interesting take on it and I think it sort of highlights part of where I take issue. Since it has no world model (at least, not one that researchers can yet discern substantively) and has no adaptive capability (without purposeful fine-tuning of its output by machine learning engineers), it is sort of a closed system. And within that, it’s locked into its limitations and biases, which are derived from the material it was trained on and the humans who consciously fine-tuned it toward one “factual” view of the world or another. Human beings work on probability in a way too, but we also learn continuously and are able to do an exchange between external and internal, us and environment, us and other human beings, and in doing so, adapt to our surroundings. Perhaps more importantly in some contexts, we’re able to build on what came before (which is where science, in spite of its institutional flaws at times, gets much of its strength).

    So far, LLMs operate sort of like a human whose short-term memory fails to integrate things into long-term memory, except in this case it’s by design. That presents a problem for keeping them useful beyond the specific point in time they were trained on. As an example of what I mean, suppose we’re back when it was commonly thought the earth was flat, and we construct an LLM with a world model based on that. Then the consensus changes. Now we have to either train a whole new LLM (and LLM training is expensive and takes time, at least so far) or somehow go in and change its biases. Otherwise, the LLM just sits there in its static world model, continually reinforcing the status quo belief for people.

    OTOH, supposing we could get it to a point where an LLM can learn continuously, now it has all the stuff being thrown at it to contend with and the biases contained within. Then you can run into the Tay problem, where it may learn all kinds of stuff you didn’t intend: https://en.wikipedia.org/wiki/Tay_(chatbot)

    So I think there are a couple of important angles to this. One is the purely technical endeavor of seeing how far we can push the capability of AI (which I’m not inherently opposed to; I’ve been following and using generative AI for over a year now as it’s become more of a big thing). And then there is the culture/power/utility angle, where we’re talking about what kind of impact it has on society, what kind of impact we think it should have, and so on.

    The second one is where things get hairy for me fast, especially since I live in the US and can easily imagine such a powerful mode of influence being used to further manipulate people. Or, on the “incompetence” side of malice and incompetence, poorly regulated businesses simply being irresponsible with the technology. Like Google’s recent stuff with AI search result summaries giving hallucinations. Or like what happened with the Replika chatbot service in early 2023, where they filtered it heavily out of nowhere, claiming it was for people’s “safety,” and in so doing caused mental health damage to people who were relying on it for emotional support; and mind you, the service had actively designed and advertised it as being for that, so it wasn’t like people were using it in an unexpected way. The company was just two-faced and thoughtless throughout the whole affair.


  • It never ceases to amaze me the amount of effort being put into shoehorning a probability machine into being a deterministic fact-lookup assistant. The word “reliable” seems like a bit of a misnomer here. I guess only in the sense of reliable meaning “yielding the same or compatible results in different clinical experiments or statistical trials.” But certainly not reliable in the sense of “fit or worthy to be relied on; worthy of reliance; to be depended on; trustworthy.”

    That notion of reliability has to do with “facts” determined by human beings and implanted in the model as learned “knowledge” via its training data. There’s just so much wrong with pushing LLMs as a means of getting accurate information. One of the problems: suppose they got an LLM to, say, reflect the accuracy of Wikipedia 99% of the time. Even setting aside how shaky Wikipedia would be on some matters, it’s still a black-box AI whose sources you can’t check. You’re supposed to just take it at its word. So sure, okay, you tune the thing to give the “correct” answer more consistently, but the person using it doesn’t know that and has no way to verify it without checking outside sources, which defeats the whole point of using it to get factual information…! 😑

    Sorry, I think this is turning into a rant. It frustrates me that they keep trying to shoehorn LLMs into being fact machines.



  • I can explain more later if need be, but some quick-ish thoughts (I have spent a lot of time around LLMs and discussion of them in the past year or so).

    • They are best for “hallucination” on purpose. That is, fiction/fantasy/creative stuff. Novels, RP, etc. There is a push in some major corporations to “finetune” them to be as accurate as possible and market them for that use, but this is a dead end for a number of reasons, and you should never ever trust what an LLM says on anything without verifying it outside of the LLM (i.e. you shouldn’t take what it says at face value).

    • LLMs operate on the probability of continuing what is in “context” by picking the next token. This means it could have the correct info on something and, even with a 95% chance of picking it, still hit that 5% and go off the rails (see the sketch after this list for how fast that compounds). LLMs can’t go back and edit phrasing or plan out a sentence either, so if one picks a token that makes a mess of things, it just has to keep going. Similar to an improv partner in real life: no backtracking, no “this isn’t the backstory we agreed on”, you just have to keep moving.

    • Because LLMs continue based on what is in “context” (its short-term memory of the conversation, kind of), they tend to double down on what is already said. So if you get it saying blue is actually red once, it may keep saying that. If you argue with it and it argues back, it’ll probably keep arguing. If you agree with it and it agrees back, it’ll probably keep agreeing. It’s very much a feedback loop that way.
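
    On that 95%/5% point from above: a toy calculation, pretending the per-token odds are fixed and independent (real generations aren’t that clean, but it shows why long outputs drift):

    ```python
    # If each token has a 95% chance of being the "right" continuation,
    # the chance of an entire generation staying on track drops off quickly.
    p_correct = 0.95

    for n_tokens in (10, 50, 200, 1000):
        p_all_good = p_correct ** n_tokens
        print(f"{n_tokens:>4} tokens: ~{p_all_good:.1%} chance of never going off the rails")
    ```

    And because whatever does go off the rails stays in context, later tokens get predicted from the derailed version, which is the doubling-down effect described above.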





  • Oh yeah, I’m sure there’s a whole discussion to be had about capitalist media that depicts anti-capitalist themes but does it in such a watered-down way that it’s more of an aesthetic than an actual criticism of what’s happening. I’m not sure if it’s always meddling or if it’s more that the people writing it are too liberal to have a clue how to represent such a thing. I admit that even with what I know, it’s a challenge to write a fictional representation of such matters, because there’s always some element of it being divorced from the realities. But I’m sure if I were writing a cyberpunk-esque story, it’d be one that involves people getting organized against the source of the problems and contending with the unique technological challenges of opposing it.