• 1 Post
  • 19 Comments
Joined 2 years ago
cake
Cake day: June 29th, 2023

help-circle

  • If you find that OCR doesn’t get you very far, maybe try a small vLM to parse PNGs of the pages. For example, Nanonets OCR will do this, although quite slow if you don’t have a GPU. It will give you a Markdown version of the page, which you can then translate with another tool.

    PaddleOCR might also be useful, since it focuses on Chinese, but it’s more difficult to set up. To add to this, some other options are MinerU and MistralOCR (this is paid, but you can test it for free if you upload it in Mistral’s library).






  • For notes, I have moved to Joplin with the option to synchronize my data using a WebDAV server. It works really well, and it has both a mobile and desktop app. If you’re interested in developing your project, maybe you can have a look at the options this provides. For example, I really like the ability to separate notes between groups, assign tags, create drawings, and the possibility to use Markdown.

    Good luck with your projects! To mirror @enemenemu’s suggestion, I would also look into collaborating with the people trying to push the EU Docs alternative. Not sure if that will work, but it’s worth a shot if you’re interested :D



  • It’s a bit short-sighted to say that Trump is the one calling in shots here, specifically to weaken the US. It is pretty clear that he is following the plan put forward by the Heritage Foundation word by word. If I understood correctly, the idea is to make the American economy more resilient at the expense of all of its (poor) citizens. Once that is done, they can then leverage their safe zone to further influence policies in other countries. For example, get the EU to lower regulations, so American companies can extract more wealth.

    Here is a quote from the actual “Project 2025 Mandate for Leadership” PDF:

    Needed reforms

    […]

    Increase allied conventional defense burden-sharing. U.S. allies must take far greater responsibility for their conventional defense. U.S. allies must play their part not only in dealing with China, but also in dealing with threats from Russia, Iran, and North Korea.

    1. Make burden-sharing a central part of U.S. defense strategy with the United States not just helping allies to step up, but strongly encouraging them to do so.
    2. Support greater spending and collaboration by Taiwan and allies in the Asia–Pacific like Japan and Australia to create a collective defense model.
    3. Transform NATO so that U.S. allies are capable of fielding the great majority of the conventional forces required to deter Russia while relying on the United States primarily for our nuclear deterrent, and select other capabilities while reducing the U.S. force posture in Europe.
    4. Sustain support for Israel even as America empowers Gulf partners to take responsibility for their own coastal, air, and missile defenses both individually and working collectively.
    5. Enable South Korea to take the lead in its conventional defense against North Korea.

    […]

    They are engineering most of these situations that we’ve seen in the media specifically to make the ideas more digestible to the average population. See the Zelenskyy case: “This is going to be great television” - the guy is not even hiding it.

    On one hand, Taiwan is right to say that the US won’t abandon them. The US does not produce enough chips locally to just let them get gobbled up by China. However, this sort of “theatrics” is not over, and they will come up with a reason to scare Taiwan into investing a lot more in defence, specifically to prepare them for a fight to destabilize China.

    It’s truly sad that this administration is now in power to push these ideas. The average American is going to become much poorer and hateful due to all protections previously put in place being dismantled. Hopefully people wake up and kick them out of office, but the damage done to foreign relationships is already done.



  • Piracy. I’d buy albums if I had money, though. I’ll slowly phase into getting them once I get some more cash.

    I can find most stuff I listen to, and I rarely grow my music library. I mostly listen to 20-30 albums, with some more mainstream music peppered in.

    My music library currently sits at 90 gigabytes (mostly flacs), so quite small compared to others I’ve seen around here. Still, I have plenty of variation to keep me entertained :D

    If you have Tidal, aren’t there some apps to rip the lossless audio from there? You could get most of the stuff that you need, and then cancel the subscription. If you feel bad, maybe order some merch from the band, haha.


  • Click for longer opinion

    If I remember correctly, even though Fuchsia is used in production, it is mainly targetting mobile or IoT devices. Nevertheless, the underlying micro-kernel, Zircon, is written in C/C++, which differs from Redox. Now, I’m not saying that Redox solves everything by writing the kernel in Rust. It will require plenty unsafe blocks to achieve what it needs, but it makes you aware beforehand that you should be careful about how you implement that bit of code. Having this clear marking could also make the kernel code review process more likely to catch issues.

    Disregarding this, if I am not mistaken, Redox aims to be a drop-in replacement for Linux one day, both for desktop and server, while Fuchsia only wishes to be integrated in/replace Android. Linux is perfectly fine for most use cases, I am not suggesting otherwise! However, given how many issues resulted from overflow/memory corruption issues that could have been potentially easier to identify if Rust (or any other memory safe language) was used, you’d think that there is incentive to rely on it for kernel development. Linus himself made this decision as well when allowing Rust to be used in the Linux kernel development (albeit perhaps a bit too early).

    The Linux kernel is not flawed, and Redox is probably years away from being even near it. However, having memory-safety from the get-go as a requirement for developing the kernel could lead to fewer exploits, compared to what we have today with Linux. Just as you’ve said, most users are not aware of it/they don’t care, but the big players will care about keeping information safe on their servers. Just to conclude, Redox OS is not just Linux rewritten in Rust, and could potentially have many other benefits that are particularly juicy for data centers. Too bad it’s not production ready yet :D



  • I see your point. However, integrating Rust properly in the Linux kernel is an uphill battle. Redox OS is not at all close to being stable, but it showcases that you can build a Rust kernel from scratch, and integrate it into an OS that meets some of the requirements of a modern one. Of course, considering it a toy project and glancing over its potential doesn’t help with adoption. They even mention in their description that currently they can only support a community manager and a student developer with the current donations. When you compare that to the amount of money and developers involved in the Linux kernel, it’s insignificant.

    I was not suggesting that the Rust For Linux devs jump ship, but it could be beneficial for the investors behind the project to look at alternatives. Heck, the Linux kernel started as a toy project itself. I believe that a team focused solely on such a Rust-only kernel could spearhead needed changes to reach something stable, as opposed to investing time and money into fighting established C developers to integrate a memory-safe language in the kernel fully.



  • If I am not mistaken, the difference was that the Internet Archive was distributing books with a DRM that would make the PDF unusable after a certain time. You could relate it to how a physical library offers books for a limited time, for free. Now, of course, one could bypass the DRM or copy the contents differently, but so can another person photocopy a book they borrowed physically. Meanwhile, other physical libraries are allowed to distribute e-books, but I’m not sure if that’s made possible due to licensing fees.

    I’m not saying that they approached this well, especially given the copyright laws in the US, but it was indeed a good thing for the normal person at the time. Too bad that the judicial system in the US is biased towards leeching companies. I really can’t wait to see the AI vs publishers fight, though. Let’s see who has deeper pockets and better plants in the courts :D


  • What db2 already said. Microsoft just released Phi-3 mini, which could, allegedly, run locally on newer smartphones.

    If I understood correctly, the Rabbit thingy just captures your information locally and then forwards it to their server. So, if you want more power, you could probably do the same by submitting the same info to a bigger open source model than Phi-3, like Llama 3, hosted on your homelab. I believe you can set it up with huggingface/gradio, which sort of provides an API that you could use.

    That way, you don’t need a shitty orange box, and can always get the latest open source models with a few lines of code. There are plenty of open source frameworks in the works at the moment, and I believe that we’re not far off from having multi-modal LLMs running on homelab-level hardware (if you don’t mind a bit of lag).