

Regular users can use Gemini, Deepseek, Meta AI, and there will probably be many more services in the future.
I don’t doubt there was some localized fraud going on. I do doubt fraud was responsible for the red-shift in the vast majority of counties as shown on this map: https://www.nytimes.com/interactive/2024/11/06/us/politics/presidential-election-2024-red-shift.html
I think they tried to steal the election, but didn’t need to. I do wish there were more investigations, because they’re probably going to do all the same stuff and much more in the next elections.
Could use it kind of like an extra monitor with something like Barrier.
Could use it like a home assistant for a kitchen or something, but I don’t know if there’s any good privacy-respecting software for that ATM (looks like Mycroft went bankrupt).
I used an old laptop I had laying around for controlling a Maslow CNC. Could also use a laptop to run OctoPrint or something.
Yeah, I was disappointed when I bought a very expensive Galaxy S22 to replace my old Moto G, whose charging port had worn out. The S22 had worse battery life, a worse camera, and no noticeable performance improvement. Recently, my S22 stopped charging, and I just bought a “Mint”-grade used Pixel 6 and installed GrapheneOS on it. Happy so far, and it’s nice to be able to block network access to all apps, including Google’s.
I’m curious if ByteDance could just create a new legal entity and call it TikTak or something.
Hmm. Looks like that was in Texas too: https://truthout.org/articles/a-city-in-texas-just-put-10000-bounties-on-trans-people-using-the-bathroom/. And they’re going to pass quite a few more bounty laws this year: https://prismreports.org/2025/01/08/bounty-laws-texas-trans-rights-abortion/
Dunno, they’d probably have a hard time suing European instances, but they can’t outright block them, as that would be unconstitutional. U.S. states have recently been using lawsuits to get around constitutionality. E.g., Texas also has a “bounty” law: if you know a woman went out of state to get an abortion, you can report it, the state will sue her, and you get $10,000. I think another state has a similar law for reporting a trans person using a restroom that doesn’t match the genitalia they were born with.
With the current laws on the books, Texas could probably sue Lemmy instances because they contain pornographic content and they don’t verify users’ identity.
If you have to verify children’s identity, you have to verify everyone’s identity. This is part of KOSA. https://www.eff.org/deeplinks/2024/12/kids-online-safety-act-continues-threaten-our-rights-online-year-review-2024
That’s really cool (not the auto opt-in thing). If I understand correctly, that system offers pretty strong theoretical privacy guarantees (assuming their closed-source client software works as they say, sending fake queries and all that for differential privacy). If the backend doesn’t work like they say, they could infer which landmark is in an image by finding the approximate minimum distance to embeddings in their DB, but with the fake queries they can’t be sure which query is real. Either way, they can’t see the actual image, as long as the “128-bit post-quantum” encryption algorithm doesn’t have any vulnerabilities (and the closed-source software works as described).
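The decoy-query part of that idea can be sketched in a few lines. This is purely illustrative (the function, decoy count, and embedding handling are my inventions; the real protocol also involves homomorphic encryption and relays), but it shows why the server can’t tell which lookup is genuine:

```python
import random
import numpy as np

def make_queries(real_embedding, n_decoys=3):
    """Mix the real photo's embedding with random fake embeddings and
    shuffle, so the server sees n_decoys + 1 lookups and can't tell
    which one corresponds to the user's actual photo."""
    dim = len(real_embedding)
    decoys = [np.random.randn(dim) for _ in range(n_decoys)]
    queries = decoys + [np.asarray(real_embedding, dtype=float)]
    random.shuffle(queries)  # server sees them in random order
    return queries
```

Each individual query leaks one nearest-landmark lookup, but with shuffled decoys the server only learns that *one of* the queries was real.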
I’m loading up on vacuum tubes.
Last time I looked it up and calculated it, these large models are trained on only about 7x as many tokens as they have parameters. If you think of it like compression, a 7:1 ratio for lossless text compression is perfectly possible.
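Back-of-the-envelope, what that ratio implies in bytes (the ~7x token figure is from above; bytes-per-token and bytes-per-parameter are my assumptions, not from any model card):

```python
# Illustrative numbers only, not any specific model's card.
params = 70e9                    # parameters (illustrative model size)
tokens = 7 * params              # training tokens at ~7x parameters

bytes_per_token = 4              # rough average raw UTF-8 bytes per token
text_bytes = tokens * bytes_per_token

bytes_per_param = 2              # fp16 weights
model_bytes = params * bytes_per_param

print(f"token:parameter ratio = {tokens / params:.0f}:1")     # 7:1
print(f"text:model byte ratio = {text_bytes / model_bytes:.0f}:1")  # 14:1
```

Depending on what you assume for bytes per token and bits per weight, the implied “compression ratio” moves around quite a bit, but it stays in the range where heavy memorization isn’t implausible.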
I think the models can still output a lot of stuff verbatim if you try to get them to; you just hit the guardrails they put in place. It seems to work fine for public domain stuff, e.g. “Give me the first 50 lines from Romeo and Juliet.” (albeit with a TOS warning, lol). “Give me the first few paragraphs of Dune.” seems to hit a guardrail, or maybe a refusal trained in through reinforcement learning.
A preprint paper was released recently that detailed how to get around RL by controlling the first few tokens of a model’s output, showing the “unsafe” data is still in there.
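The core trick, as I understand it, is just forcing the first few tokens of the assistant’s reply. A minimal sketch (everything here is hypothetical, including `model.sample_next`, which stands in for a real decoding loop):

```python
def generate(model, prompt, forced_prefix, max_tokens=256):
    """Hard-code the first output tokens to a compliant-sounding
    prefix, then let the model continue normally. RLHF refusals tend
    to key on how an answer *starts*, so forcing the opening tokens
    can route around them."""
    output = list(forced_prefix)   # tokens chosen by us, not the model
    while len(output) < max_tokens:
        output.append(model.sample_next(prompt, output))
    return output
```

This only works with local models or APIs that let you pre-fill part of the assistant turn; hosted chat UIs generally don’t.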
I think TikTok appeased the right by changing their algorithm. Charlie Kirk is apparently doing extremely well on the platform now.
I use GPT (4o, premium) a lot, and yes, I still sometimes experience source hallucinations. It also will sometimes hallucinate incorrect things not in the source. I get better results when I tell it not to browse. The large context of processing web pages seems to hurt its “performance.” I would never trust gen AI for a recipe. I usually just use Kagi to search for recipes and have it set to promote results from recipe sites I like.
Tor for browsing is similar to a VPN. I2P and Tribler for downloads are also similar. You could also just rent a cheap VPS and set up your own VPN. There’s a high chance people will do illegal shit through VPN-like services, so I don’t think a P2P VPN-like service where everyone acts as an exit node is viable.
Hmm. I just assumed 14B was distilled from 72B, because that’s what I thought llama was doing, and that would just make sense. On further research it’s not clear if llama did the traditional teacher method or just trained the smaller models on synthetic data generated from a large model. I suppose training smaller models on a larger amount of data generated by larger models is similar though. It does seem like Qwen was also trained on synthetic data, because it sometimes thinks it’s Claude, lol.
Thanks for the tip on Medius. Just tried it out, and it does seem better than Qwen 14B.
Larger models train faster (need less compute), for reasons not fully understood. These large models can then be used as teachers to train smaller models more efficiently. I’ve used Qwen 14B (14 billion parameters, quantized to 6-bit integers), and it’s not too much worse than these very large models.
Lately, I’ve been thinking of LLMs as lossy text/idea compression with content-addressable memory. And 10.5GB is pretty good compression for all the “knowledge” they seem to retain.
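That 10.5GB figure lines up with the quantization arithmetic exactly (decimal gigabytes, ignoring any file-format overhead):

```python
params = 14e9             # Qwen 14B parameters
bits_per_param = 6        # 6-bit quantization
size_bytes = params * bits_per_param / 8   # 8 bits per byte

print(f"{size_bytes / 1e9:.1f} GB")  # 10.5 GB
```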
Oh, I forgot about Claude. Last time I tried it, it seemed on par with or even better than ChatGPT-4o (but was missing features like browsing).