• In the future, it could power virtual avatars that render locally and don’t require video feeds—or allow anyone with similar tools to take a photo of a person found online and make them appear to say whatever they want.

    …or open a dystopian hellscape of disinformation, scams, and corporate police-state control of information at an unprecedented scale! Three cheers for the reckless pursuit of technology for technology’s sake!

      • I understand AI evangelists - which you may or may not be idk - look down on us Luddites who have the gall to ask questions, but you seriously can’t see any potential issue with this technology without some sort of restrictions in place?

        You can’t see why people are a little hesitant in an era where massive international corporations are endlessly scraping anything and everything on the Internet to dump into LLMs et al. to use against us to make an extra dollar?

        You can’t see why people are worried about governments and otherwise bad actors having access to this technology at scale?

        I don’t think these people should be locked up or all AI usage banned. But there is definitely a middle ground between absolute prohibition and no restrictions at all.

        •  barsoap   ( @barsoap@lemm.ee ) 
          15 days ago

          None of those concerns are new in principle: AI is the current thing that makes people worry about corporate and government BS but corporate and government BS isn’t new.

          Then: The cat is out of the bag, you won’t be able to put it in again. If those things worry you the strategic move isn’t to hope that suddenly, out of pretty much nowhere, capitalism and authoritarianism will fall never to be seen again, but to a) try our best to get sensible regulations in place, the EU has done a good job IMO, and b) own the tech. As in: Develop and use tech and models that can be self-hosted, that enable people to have control over AI, instead of being beholden to what corporate or government actors deem we should be using. It’s FLOSS all over again.

          Or, to be an edgelord to some of the artists out there: If you don’t want your creative process to end up being dependent on Adobe’s AI stuff, then help train models that aren’t owned by big CGI. No tech knowledge necessary; this would be about providing a trained eye as well as data (i.e. pictures) that allow the model to understand what it did wrong, according to your eye.

          • I said:

            I don’t think these people should be locked up or all AI usage banned. But there is definitely a middle ground between absolute prohibition and no restrictions at all.

            I have used AI tools as a shooter/editor for years so I don’t need a lecture on this, and I did not say any of the concerns are new. Obviously, the implication is AI greatly enables all of these actions to a degree we’ve never seen before. Just like cell phones didn’t invent distracted driving but made it exponentially worse and necessitated more specific direction/intervention.

    • They mentioned one potential use that I thought has value and that I hadn’t considered. For video conferencing, this could transmit data without sending video and greatly reduce the amount of bandwidth needed by rendering people’s faces locally. I don’t think that outweighs the massive harms this technology will unleash. But at least there was some use that would be legit and beneficial.

      I’m someone who has a moral compass and I don’t like that scammers will abuse this shit so I hate it. But there’s no keeping it locked away. It’s here to stay. I hate the future / now.
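A rough back-of-the-envelope sketch of the bandwidth argument above, for what it's worth. Every number here is a made-up assumption for illustration (the size of the per-frame face parameters, the frame rate, and the bitrate of an ordinary video call), not a figure from the research, and the function name `bitrate_kbps` is hypothetical.

```python
def bitrate_kbps(values_per_frame: int, bytes_per_value: int, fps: int) -> float:
    """Bitrate in kilobits per second for a stream of per-frame parameters."""
    return values_per_frame * bytes_per_value * 8 * fps / 1000


# ASSUMPTION: 64 pose/expression values per frame, 2 bytes each, 30 frames/s.
params_kbps = bitrate_kbps(values_per_frame=64, bytes_per_value=2, fps=30)

# ASSUMPTION: an ordinary compressed video call uses roughly 1500 kbit/s.
video_kbps = 1500.0

# How many times smaller the parameter stream is than the video stream.
savings_ratio = video_kbps / params_kbps

print(f"parameters: {params_kbps:.2f} kbit/s, video: {video_kbps:.0f} kbit/s")
print(f"video is about {savings_ratio:.0f}x larger under these assumptions")
```

Audio still has to be sent either way, and the receiving end pays for the savings with local compute, so this only covers the network side of the trade-off.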

      • Wouldn’t you then have to run the AI locally on a machine (which probably draws a lot of power and memory) or use it via cloud (which depends on bandwidth just like a video call)? I don’t really see where this technology could actually be useful. Sure, it would work if it were only a minor computation, like when you take a picture or video with any modern smartphone. But computing an entire face and voice seems much more complicated than that and not really feasible for the usual home device.

        • Yeah, it’s not practical right now, but in 10 years? Who knows, we might finally have some built-in AI accelerator capable of running big neural networks on consumer CPUs by then (we do have AI accelerators in a large chunk of current CPUs, but they’re not up to the task yet). The system memory should also go up now that memory-hungry AI is inching closer to mainstream use.

          Sure, Internet bandwidth will also increase, meaning this compression will be less important, but on the other hand, it’s not like we stopped improving video codecs after H.264 because it was good enough - there are better codecs now even though we have the resources to handle bigger H.264 videos.

          The technology doesn’t have to be useful right now - for example, neural networks capable of learning have been studied since the 1940s, even though there would be no way to run them for many decades, and it would take even longer to run them in a useful capacity. But now that we have the technology to do so, they enjoy rapid progress building on top of that original foundation.

        •  barsoap   ( @barsoap@lemm.ee ) 
          15 days ago

          A model that can only generate frontal-to-profile views of heads would be quite small; I can totally see that kind of thing running on current consumer GPUs in real time. Near real time is already possible with SDXL-based models with some speedup tricks applied, as long as you have a mid-range gaming GPU, and those models are significantly more general. It’s not like the model would need to generate spaghetti and sports cars alongside the head.

      • Also I would argue sending the actual video of what is happening in front of the camera is kind of the entire point of having a video call. I don’t see any utility in having a simulated face to face interaction where neither of you is even looking at an actual image of the other person.

    • You can’t simply not develop a technology. Progress is going to move forward. If they don’t do it, somebody else is going to figure out how. The tools are out there. The math works. Better for researchers to do it now and scare us into finding solutions than for criminals to develop it first.

    • Other than the obvious malicious uses of this technology, it could be great for multimedia, great for creative control for cast, great for virtual meetings to always look “your best” (as determined by each individual, e.g. clean-cut pristine, and/or preferred gender, and/or favorite anime, etc.). There are also use cases to hear letters spoken by a lost loved one, or replace the Three Stooges with politicians. Tons of “safe” use cases that I am looking forward to.

      • This is a really positive take. I would love to create such an AI of myself in my likeness so that if one day I pass away before my wife, she could enjoy having that comfort. I imagine it speaking like: while I’m not your husband, here’s what I think he would’ve said.

        Deep faking myself so I don’t have to use my camera in meetings? I would pay for that feature.

        • Entertainment might be pointless to some. I dream of having an on-demand Netflix that will generate whatever type of content I can imagine on demand, or better yet already know my preferences and all I have to do is tell it my mood and it will start playing something I would like.

          •  floofloof   ( @floofloof@lemmy.ca ) 
            15 days ago

            A difference in goals, I guess. Having programs generated just to pander to my existing tastes sounds horrible to me. I want to be challenged and surprised and have my tastes tested and changed in unpredictable ways. I also want to watch stuff that’s written by humans and acted by humans, because there’s a sense of shared life there that there isn’t in an AI-generated video.

    • If something is possible, and this clearly is, someone is going to develop it regardless of how we feel about it, so it’s important for non-malicious actors to make people aware of the potential negative impacts so we can start to develop ways to handle them before actively malicious actors start deploying it.

      Critical businesses and governments need to know that identity verification via video and voice is much less trustworthy than it used to be, and so if you’re currently doing that, you need to mitigate these risks. There are tools, namely public-private key cryptography, that can be used to verify identity in a much tighter way, and we’re probably going to need to start implementing them in more places.
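To make the public-private key idea above a bit more concrete, here is a minimal sketch of challenge-response identity verification: the verifier sends a fresh random challenge, the person proves who they are by signing it with a private key, and the verifier checks the signature against the public key it already has on file. Real deployments would normally use an established scheme such as Ed25519 or RSA through a vetted library. Since Python's standard library ships neither, this sketch swaps in a Lamport one-time signature built from SHA-256, purely as an assumption for illustration. All function names are hypothetical, and a Lamport key pair must only ever sign a single message.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    """SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).digest()


def _bits(message: bytes) -> list[int]:
    """The 256 bits of the message's SHA-256 digest, most significant first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def generate_keypair():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(zero), _h(one)) for zero, one in private]
    return private, public


def sign(private, message: bytes) -> list[bytes]:
    """Reveal one secret per digest bit. ONE-TIME USE: never reuse the key."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    if len(signature) != 256:
        return False
    return all(
        secrets.compare_digest(_h(revealed), public[i][bit])
        for i, (revealed, bit) in enumerate(zip(signature, _bits(message)))
    )


# Enrollment (done once, in advance): the verifier stores the public key.
private_key, public_key = generate_keypair()

# Verification: a fresh random challenge prevents replaying an old response.
challenge = secrets.token_bytes(32)
response = sign(private_key, challenge)
print("identity confirmed:", verify(public_key, challenge, response))
```

The point of the design is that a convincing face or voice is no help to an impersonator: without the private key they cannot produce a valid response to a challenge they have never seen before.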

    • They’re also releasing a detector, for what it’s worth.

      Yeah, this one seems like it will have more negative applications than positive. Usually you’ll have a lot more content from someone you want to copy for non-deceptive reasons. It’s inevitable all video will be easily fake-able one day soon, but why hasten it?

  • The actual research page is so awkward. The TLDR at the top goes:

    single portrait photo + speech audio = hyper-realistic talking face video

    Then a little lower comes the big red warning:

    We are exploring visual affective skill generation for virtual, interactive characters, NOT impersonating any person in the real world.

    No siree! Big “not what it looks like” vibes.

  • Someone help me out please. Who was the 90s sci-fi author who predicted actors would go away and all movies would be made using CGI/AI? She had characters in the book watching movies starring Humphrey Bogart and John Wayne as detectives solving crimes (and so on). She also predicted “ractors”, people who act in front of a camera so a computer can use their motion and expressions to animate a character on screen in real time.

    My feeble brain, I swear… In any case, thanks to her, I knew this day was coming. Gonna be a wild ride though.

    • According to Le Chat,

      The author you’re thinking of is Neal Stephenson, and the book is “Snow Crash” published in 1992. In the book, he coined the term “ractors” for actors who perform in front of motion-capture cameras to create lifelike animations. He also predicted the use of CGI and AI in filmmaking to create movies with long-dead actors.

      I haven’t read it and the Wikipedia article doesn’t seem to mention virtual actors, so it could be wrong. At least it didn’t hallucinate a fake book.

      • Oh snap, thanks - I was mixing up The Diamond Age with another book, yes. Ractors are from Stephenson, but I also had another author’s books in my head. See? Feeble mind. There’s still another woman author I need to track down and re-read here.

      •  Gamma   ( @GammaGames@beehaw.org ) 
        16 days ago

        I asked Perplexity, “What is the scifi book from the 90s that had ‘ractors,’ where a person would act in front of a camera and a computer would animate a CG model?” and got what other commenters are saying is the correct answer:

        The science fiction book from the 90s that featured “ractors,” where a person would act in front of a camera and a computer would animate a CG model, is not directly mentioned in the provided search results. However, based on the description of “ractors” and the context of computer animation and CG models, it seems you might be referring to “The Diamond Age” by Neal Stephenson, published in 1995. In this novel, the term “ractor” (short for “interactive actor”) is used to describe performers who participate in interactive theater through virtual reality environments, which could align with the concept of acting in front of a camera to animate a CG model. However, since this specific detail is not found in the search results, this answer is based on existing knowledge outside of the provided sources.

    • And that’s the problem. The average person isn’t looking for it, and will absolutely not see it. As long as it’s good enough, that’s all that matters. A plausible enough video of Joe Biden talking about rounding up Christians into internment camps that gets shared on Facebook, or something like that which panders to right-wing bigotry, is enough to get people going. Even real images and videos that are miscaptioned are enough, and even when a link is there that disproves the caption.

      People seriously underestimate just how horrifying the possibilities are with this shit. And as high stakes as this election cycle is, and given the state of politics in this country, the tendency for people to latch on to anything that affirms their preexisting ideals creates a fucking minefield.

      • This is an education problem as much as – if not more so than – a tech problem. Before the GOP gutted critical thinking wherever they held a majority and two generations were able to grow up under those circumstances, a video of any current president rounding up Christians would have been roundly rejected as either satirical or disinformation by the vast majority of the population, owing to the absurdity of the idea.

        Once we got to the point of a not-insignificant minority of the population believing that the true power in the United States lies in the basement of a pizza shop with no basement …

      • I’ve seen far more convincing deepfakes, to the point I couldn’t tell until I was told. I’ve experimented with this myself. After a bit of trial and error, almost anyone can easily create shockingly convincing deepfakes. One interesting method is using 3D rendered characters with deepfake faces.

  • I think this has an effect most people don’t think of: Media will just lose its value as a trusted source of information. We’ll just lose the ability to rely on broadcast media, as anything could be faked. Humanity is back to “word of mouth”, I guess.

  • Yeah, Microsoft isn’t releasing this until we can use it responsibly.

    1. We’ll never be able to guarantee that. There will always be people abusing this.

    2. Though right now it’s in the hands of Microsoft and likely requires a shit tonne of hardware to run (I’d imagine a collection of specialized servers), this tech WILL come out eventually, and eventually, everyone will be able to run it.

    3. I give it 5-10 years tops before anyone can just do this with anyone. Want to make a movie of trump or Hilary fucking a donkey? Done. Want to make a video of your 5 year old daughter in a gangbang? Done. The future is very bleak.

    I’m honestly unsure if the internet was a good idea and I’m even less sure if humanity was a good idea.