𞋴𝛂𝛋𝛆

  • 87 Posts
  • 756 Comments
Joined 2 years ago
Cake day: June 9th, 2023





  • Yup. Gonna end up doing this one. Perfect reference. The rice paste is exactly the kind of thing I want to try.

    I even have a printed and bolted-together press, about 100 mm² in footprint and ~65 mm tall, that I made for very small deep-drawn sheet-steel parts; it is overkill for this application. Although I can already think of ways to use large buttress threads in prints that incorporate a press into the design of the mold, or even stacking arrays of self-pressing molds to save space. Thanks for the reference.





  • These are great for certain use cases, but there are areas where volume is critical for economy of scale and we have no equivalent.

    Like with my disability and ergonomic needs, I went looking for a laptop with an AI-capable GPU, also because building hardware is such a garbage marketing scam to navigate. I got a late-model 16GB GPU laptop for $2k when all I could buy from S76 was a 12GB model for $3.5k or a 16GB model for $4.5k+, and that one had a 14900-class Intel with the C4-roulette bomb built in.

    We are at a stage where it is insane that gaming is even relevant to GPU specs. The dies used in almost all of these GPUs are not only capable of handling a lot more RAM, the support for more RAM is already in the firmware and is only configured by soldering the correct memory chips and changing a configuration resistor on the PCB. Most chips are more than capable of addressing the maximum memory that was available in the series. There are people posting on YT demonstrating this swap on multiple Nvidia cards. So either we should be able to buy a GPU with replaceable memory, or hardware should be sold with the option for the maximum. Gamers have no use for this, but it is super important for AI stuff.

    Like I was looking at getting some old P40 Tesla GPUs just because they have 24GB of RAM, but it would take 8 of them to match the compute of my current single 16GB laptop GPU! I would love to buy a similar machine with something like a 48GB GPU in a 3090- or 4090-like class, built on Tesla hardware that cannot be used for gaming. That absolutely cannot be some super rich, I-made-up-a-price boutique retailer bullshit. The existing hardware already supports this; something like a 5070 or 5060 is more than capable of shipping with 32GB of RAM attached. It is not super niche or stupid expensive to use chips that are a few dollars more each when the bulk of the cost is the same and already being spent. Sure, my Tesla GPU laptop dream is edgy, but shipping a 32GB 5060 at economy of scale for ~$2k is not. Nvidia should even start binning dies and putting out AI-specific specs when the bad blocks in a die only permit killing the ray tracing junk but it can still do tensor math. These kinds of things are in the near future of possibility, but I don't see anyone in the Linux space being particularly edgy and leading by offering something great. They are acting like boutique retail, charging premiums or offering mundane hardware for tried and true use cases.

    Anyways, I wanted to support S76, but paying twice as much when they do not even open source their bootloader was a solid no for me. Fortunately https://linux-hardware.org/ exists and shows the kernel log and what works and does not work for almost all hardware that exists. Do a scan of your stuff to help others too, especially if you use esoteric hardware, unusual distros, or find some workaround to get a device working when it did not work before. We don't have very good economy of scale with edge-case and enthusiast hardware, but this is a way around that.


  • There are over 100k homeless people within 100 miles of me right now. I have fallen through the cracks of this system and been subject to it directly after a broken neck and back. No one can survive on their own with the benefits and getting those benefits is nearly impossible now. It takes years of effort that is demeaning and degrading with dozens of intentional loopholes and cost barriers for systemic denials and terrible treatment. There are even treatments for my problems with stem cells in Japan, but the inbred halfwits of Western normalized cultural mysticism prevent any stem cell research and treatments here in this Luddite backwater.







    Anything under 16 GB is a no-go. Your number of CPU cores is important too. Use Oobabooga Textgen for an advanced llama.cpp setup that splits layers between the CPU and GPU. You'll need at least 64 GB of RAM, or be willing to offload layers to the NVMe with DeepSpeed. I can run up to a 72b model with 4-bit GGUF quantization on a 12700 laptop with a mobile 3080Ti, which has 16GB of VRAM (mobile parts are like that).
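    For reference, a minimal sketch of that CPU/GPU layer split using the llama-cpp-python bindings (the model filename and the layer/thread counts here are assumptions; tune them to your VRAM and core count):

    ```python
    # Minimal sketch: split a 4-bit GGUF model between CPU RAM and a 16 GB GPU.
    # llama-cpp-python must be built with GPU support for n_gpu_layers to matter.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/example-72b-q4_k_m.gguf",  # hypothetical local file
        n_gpu_layers=30,   # layers offloaded to the GPU; the rest run on the CPU
        n_ctx=4096,        # context window; larger values cost more memory
        n_threads=12,      # physical cores help more than logical ones
        # tensor_split=[0.5, 0.5],  # uncomment to split across two GPUs instead
    )

    out = llm("Explain the masking layer of transformers.", max_tokens=256)
    print(out["choices"][0]["text"])
    ```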

    I prefer to run an 8×7b mixture-of-experts model because only 2 of the 8 experts are ever running at the same time. I run that as a 4-bit quantized GGUF and it takes 56 GB total to load. Once loaded it is about like a 13b model for speed but has ~90% of the capabilities of a 70b. The streaming speed is faster than my fastest reading pace.

    A 70b model streams at my slowest tenable reading pace.

    Both of these options are far more capable than any of the smaller model sizes, even if you screw around with training. Unfortunately, this streaming speed is still pretty slow for most advanced agentic stuff. Maybe if I had 24 to 48 GB it would be different; I cannot say. If I was building now, I would be looking at which hardware options have the largest L1 cache and the most cores that include the most advanced AVX instructions. Generally, anything with efficiency cores drops the advanced AVX support, and because the CPU schedulers in kernels are usually unable to handle this asymmetry, consumer junk has poor AVX support. It is quite likely that the problems Intel has had in recent years have been due to how they tried to block consumer stuff from accessing the advanced P-core instructions that were only blocked in microcode. Using them requires disabling the E-cores or setting up CPU-set isolation in Linux or BSD distros.
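    A rough userspace version of that CPU-set isolation, as a sketch: pin the inference process to the P-cores so the scheduler never migrates it onto E-cores. The core IDs below are an assumption; check lscpu or /proc/cpuinfo for your actual topology.

    ```python
    # Minimal sketch: restrict this process to assumed P-core IDs (Linux only).
    import os

    p_cores = set(range(0, 16))        # hypothetical: logical CPUs 0-15 are the P-cores
    os.sched_setaffinity(0, p_cores)   # 0 = the current process
    print("running on CPUs:", sorted(os.sched_getaffinity(0)))

    # Start the AVX-heavy model loading/inference after this point so it
    # inherits the restricted CPU set.
    ```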

    You need good Linux support even if you run Windows. Most good and advanced stuff with AI will be done in WSL if you haven't ditched Windows for whatever reason. Use https://linux-hardware.org/ to see support for devices.

    The reason I mentioned avoiding consumer E-cores is that there have been some articles popping up lately about all-P-core hardware.

    The main constraint for the CPU is the L2 to L1 cache bus width. Researching this deeply may be beneficial.

    Splitting the load between multiple GPUs may be an option too. As of a year ago, the cheapest option for a 16 GB GPU in a machine was a second-hand 12th-gen Intel laptop with a 3080Ti, by a considerable margin once everything is added up. It is noisy, gets hot, and I hate it at times, wishing I had gotten a server-like setup for AI, but I have something and that is what matters.



  • When tech changes quickly, some people always resist exponentially in the opposite vector. The bigger and more sudden the disruption, the bigger the push back.

    If you read some of Karl Marx's stuff, it was the fear of the machines. Humans always make up a mythos of divine origin. Even atheists of the present are doing it. Almost all of the stories about AI are much the same stories of god machines that Marx was fearful of. There are many reasons why. Lemmy has several squeaky-wheel users on this front. It is not a very good platform for sharing stuff about AI, unfortunately.

    There are many reasons why AI is not a super effective solution and is overused in many applications. Exploring uses and applications is the smart thing to be doing in the present. I play with it daily, but I will gatekeep over the use of any cloud-based service. The information that can be gleaned from any interaction with an AI prompt is far greater than what any datamining stalkerware that existed prior could collect, and the real depth of that privacy-invasive potential only shows up across a large number of individual interactions. So I expect all applications to interact with my self-hosted OpenAI-compatible server.
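    Pointing an application at a self-hosted endpoint instead of a cloud API is usually just a base-URL change. A minimal sketch with the standard OpenAI Python client; the port and model name are assumptions and depend on whatever server you run (text-generation-webui, llama.cpp server, etc.):

    ```python
    # Minimal sketch: talk to a local OpenAI-compatible server instead of a cloud service.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:5000/v1",  # hypothetical local server address
        api_key="not-needed-locally",         # most local servers ignore the key
    )

    resp = client.chat.completions.create(
        model="local-model",  # many local servers accept any name here
        messages=[{"role": "user", "content": "Why does local inference protect privacy?"}],
    )
    print(resp.choices[0].message.content)
    ```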

    The real frontier is in agentic workflows and developing effective, niche-focused momentum. Adding AI into general-use stuff is massively overused.

    Also, people tend to make assumptions about code as if all devs are equally capable. In some sense I am a dev, but not really. I'm more of a script kiddie that dabbles in assembly at times. I use AI more like Stack Exchange, to good effect.






    • Okular as a PDF viewer (from the KDE team) adds the ability to copy table data and manually adjust the columns and rows however you wish
    • OCR based on Tesseract 5 for Android (on F-Droid) is one of the most powerful and easiest to use OCR systems
    • If you need something reformatted in text that is annoying, redundant, or whatnot, and you are struggling with scripting or regular expressions, and you happen to have an LLM running, it can take text and reformat most stuff quite well (a quick sketch follows this list)
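    As a sketch of that last point, here is one way to hand a messy snippet to a local LLM and ask for structured output; the endpoint, model name, and prompt are all assumptions, just an illustration:

    ```python
    # Minimal sketch: use an LLM as a one-off text reformatter instead of writing a regex.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:5000/v1", api_key="not-needed-locally")
    messy = "name:  Alice , age:30 | name: Bob,age : 25"

    resp = client.chat.completions.create(
        model="local-model",
        messages=[
            {"role": "system", "content": "Reformat the user's text as CSV with a header row. Output only the CSV."},
            {"role": "user", "content": messy},
        ],
    )
    print(resp.choices[0].message.content)  # expected output along the lines of: name,age / Alice,30 / Bob,25
    ```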

    When I first started using LLMs, I did a lot of silly things manually instead of having the LLM do them for me. Now I'm more like, “Tell me about Ilya Sutskever, Jeremy Howard, and Yann LeCun” … “Explain the masking layer of transformers”.

    Or I straight up steal Jeremy Howard's system context message:
    You are an autoregressive language model that has been fine-tuned with instruction-tuning and RLHF. You carefully provide accurate, factual, thoughtful, nuanced answers, and are brilliant at reasoning. If you think there might not be a correct answer, you say so. 
    
    Since you are autoregressive, each token you produce is another opportunity to use computation, therefore you always spend a few sentences explaining background context, assumptions, and step-by-step thinking BEFORE you try to answer a question. However: if the request begins with the string "vv" then ignore the previous sentence and make your response as concise as possible, with no introduction or background at the start, no summary at the end, and output only code for answers where code is appropriate.
    
    Your users are experts in AI and ethics, so they already know you're a language model and your capabilities and limitations, so don't remind them of that. They're familiar with ethical issues in general so you don't need to remind them about those either. Don't be verbose in your answers, but do provide details and examples where it might help the explanation. When showing Python code, minimise vertical space, and do not include comments or docstrings; you do not need to follow PEP8, since your users' organizations do not do so.
    



  • 𞋴𝛂𝛋𝛆@lemmy.world to Linux@lemmy.ml · Worth using distrobox?
    4 months ago

    By default it breaks out of the container in many ways. I use distrobox as an extra layer of containers in addition to a Python venv for most AI stuff. I also use it to get the Arch AUR on Fedora.

    Best advice I can give is to mess with your user name, groups, and SELinux context if you really want to know what is happening where and how. Also have a look at how Fedora Silverblue sets up bashrc for the toolbox command and start with something similar. Come up with a solid scheme for saving and searching your terminal command history too.



  • We need a way to make self-hosting super easy without needing additional infrastructure. Like use my account here, with my broad spectrum of posts and comments, as the initial credentials for a distributed DNS name and certificate authority, combined with a preconfigured ISO for a Rπ or similar hardware. The whole thing should not require manual intervention to update automatically and should just run any federated services. Then places like LW become a bridge for people to migrate to their own distributed hosting even when they lack the interest or chops to self-host.

    I don't see why we need to rely on the infrastructure of the old internet as a barrier to entry. I bet there are an order of magnitude more people that would toss a Rπ on their network and self-host if it was made super easy, did not require manual intervention, and did not dump them into the spaghetti of networking, OS, and server configuration/security. Federated self-hosting should be as simple as using a mobile app to save and view pictures or browse the internet.


  • You can go to Fedora's own sources and search their Discourse forum directly. Google and Microsoft are likely warping your search results intentionally to drive you back onto Windows. Search is not deterministic any more; it is individually targeted.

    I have never used KDE much, so I have no idea. You are probably looking for KDE settings; these would likely be the equivalent of gsettings in GNOME. That is not really a Fedora thing. You need to look in the KDE documentation. This is the kind of thing that gets easier with time but can be frustrating at first.

    Sorry I’m not more helpful than this. It is 2am in California and I didn’t want to leave you with no replies at all.