audiodude 13 hours ago

Volunteer for Kiwix here (https://kiwix.org), we do a lot of offline Wikipedia stuff. I've personally worked on MWOffliner (https://github.com/openzim/mwoffliner) which scrapes MediaWikis, primarily Wikipedia.

We have apps for basically every platform. Our PWA even supports IE 11!

You can use the WP1 tool which I'm the primary maintainer of (https://wp1.openzim.org/#/selections/user) to create "selections" which let you have your own custom version of Wikipedia, using categories that you define, WikiProjects, or even custom SPARQL queries.

  • strofocles 9 hours ago

    May I suggest that somebody outside your company review the website? It is not clear to me what you do, what the apps do, and so on. The copy is also kind of abstract, "we make the world a better place" type of copy. From your comment I understand you do good work, and it would be a shame for people new to your products to struggle to understand what you are doing.

    • bcraven 9 hours ago

      I don't agree with your assessment. Did you find the 'About Us' page insufficient?

      • blazerunner 7 hours ago

        I don't think the previous comment was trying to be snarky - I can see where they're coming from.

        Take my feedback with a grain of salt, as I am entirely not the target audience, but...

        "Stay Connected, Always" - weird way to put it, given it's for offline situations. At this stage it sounds like it could be a 4G or portable wifi solution?

        "Use our apps for offline content or the Kiwix Hotspot for reliable access." - so it's probably a desktop or mobile app, maybe a web app. What is Kiwix Hotspot, another app? It's unclear at this point that there is any hardware involved, or anywhere on the home page, unless I watch the video that hints at it.

        The summary in the footer was a lot clearer to me: "Kiwix is an offline reader for online content like Wikipedia, Project Gutenberg, or TED Talks. It makes knowledge available to people with no or limited internet access"

        Again, not trying to complain for the sake of it, I think this is a cool project helping under-served communities, but if people can't easily understand what you do, they may not dig deeper.

        If I can't tell what is being offered without much thinking or digging, the home page isn't doing as much as it could be.

        Perhaps it is ticking the boxes for your target audience if you have done some testing. Great! If not, some quick user testing could help optimise the messaging to make sure what you offer is landing.

        • eru 6 hours ago

          The quote from the footer sounds pretty good. That should probably be front and centre.

          • westpfelia 4 hours ago

            agreed. It hits the nail on the head.

        • ehutch79 18 minutes ago

          I honestly thought this was a self contained relay to provide internet access to remote locations...

      • Angostura 5 hours ago

        The opening sentence of the ‘About Us’ page is very good. The problem is, it is on the About Us page, rather than front and centre on the home page.

  • freedomben 13 hours ago

    Neat, thanks! I'm CTO of Ameelio (non-profit) and have been eyeing Kiwix for a while. Getting content to incarcerated people is a unique challenge due to the exceptional security requirements, and an offline solution like Kiwix might fit in well. Being able to narrow down categories is a huge capability for us. Thank you!

    • benoitberaud 29 minutes ago

      Feel free to reach out to us (Kiwix), we've already helped NGOs deploy our content to prisons for the exact reason you mention.

    • gehwartzen 13 hours ago

      Just wanted to comment on what a great mission Ameelio seems to have! Glad you guys are helping some of the most unseen in our society. Kudos!

  • flipgimble an hour ago

    If I'm reading this right, the last full zim archive of all of English Wikipedia is wikipedia_en_all_maxi_2024-01.zim, which is now about 16 months old. Is that right, or are there other, more recent sources?

    The current US administration is actively trying to interfere with Wikipedia and censor public speech or information that is detrimental to their disinformation campaign. [1]

    Do you know if there is an effort to publish more recent archives? Or do you have any advice on how outside developers could jump in to help with that project?

    [1] - https://news.ycombinator.com/item?id=43799302

    • prepperdisk an hour ago

      The Kiwix team is close on this; it’s even a direct partnership with Wikipedia to work with the newer APIs and function reliably.

  • yreg 8 hours ago

    Regarding mwoffliner: Why scrape Wikipedia when you can just download a dump?

    • detaro 6 hours ago

      If you want to test MediaWiki tooling, Wikipedia is a good test target, because it uses a lot of the features (unsurprisingly), compared to smaller wikis. (OTOH, the latter often have custom extensions, so it's not quite enough)

      • yreg 6 hours ago

        Sure, but I understood the parent as saying that the tool primarily serves for scraping Wikipedia.

    • dal 7 hours ago

      I was thinking the same. It must take much less space in database form than all the html pages.

      • josephg 7 hours ago

        It's also kind of bad form to scrape a huge website when there's a downloadable dump available. Save yourself, and more importantly Wikimedia, a whole lot of bandwidth & CPU cycles.

  • hoseja 7 hours ago

    I had an offline copy of wikipedia from like five years ago, just in case. When I recently needed it I opened the kiwix app and everything was broken by some godforsaken overhaul update. I don't have an offline copy of wikipedia on my new phone anymore.

katabasis an hour ago

I'd love to see something like this that is marketed to parents or schools who want to give kids a way to access the good parts of the internet (Wikipedia, e-books, etc) without the toxic parts. Maybe throw in some kind of de-centralized social media or file sharing for collaboration with other students or like-minded families. Kiwix + Pi-hole + Activity Pub basically. The device could create its own network (that only allowed access to an allowed list of sites, defaulting to a list of educational projects).

If no one produces such a device I may need to make one myself by the time my toddler gets old enough to go online.
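
A minimal sketch of the allow-list check such a device would need, in Python. The domain list and the `is_allowed` helper are hypothetical and chosen only for illustration; a real device might instead enforce this at the DNS or firewall level.

```python
from urllib.parse import urlsplit

# Hypothetical default allow-list of educational projects (illustrative only).
ALLOWED_DOMAINS = {"wikipedia.org", "gutenberg.org", "khanacademy.org"}


def is_allowed(url, allowed=ALLOWED_DOMAINS):
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    # Match the domain itself or any subdomain, but not look-alike hosts
    # such as "evilwikipedia.org".
    return any(host == domain or host.endswith("." + domain) for domain in allowed)


print(is_allowed("https://en.wikipedia.org/wiki/Kiwix"))
print(is_allowed("https://evilwikipedia.org/"))
```

Matching on the parsed hostname rather than on the raw URL string avoids letting through look-alike domains that merely contain an allowed name.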

  • Rediscover 11 minutes ago

    Didn't England or GB do (does?) a curated dump precisely for that purpose? I'll pull up the info tonight if no one else chimes in.

  • xandrius 17 minutes ago

    I'd definitely support this (even help). My non-related wish for my future kids would be to have some modern version of the BASIC games you would have to type in manually.

    I never did that, but it sounds like a really fun way to get into technology and programming. The difference is that I'd use different languages and maybe also allow them to draw images (with paint) and such. Everything offline and simple.

  • daniel_iversen an hour ago

    You could maybe install something like NextDNS? (even the free version might cut it) - I think that can block major categories, including the ones you mention.

Brajeshwar 14 hours ago

I’ve encountered a few people (mostly friends) who have pursued similar ideas. Around 2007 - 2012 (ish), many corners of India did not yet have access to the Internet (Jio hadn't yet introduced super cheap, almost free Internet). A few friends would download Wikipedia, take it to these corners, and teach kids.

In my case, a friend/colleague, Freeman Murray,[1] had that idea and I told him I would try it in my hometown (in one of the most remote corners of India). We did, and I got a few young kids to be the maintainers, with a few desktops (not laptops) that they carried around and used to watch videos to learn to program. It was good while it lasted. Now, those isolated places that I was scared to go to alone when I was a kid have fiber Internet connections.

On a fun note, I do have a picture of an "Internet in a Box". This was Detroit in the mid 2000s. https://www.flickr.com/photos/brajeshwar/113742187/in/album-...

1. https://www.mars.college

dragonw 6 hours ago

This could be helpful even for people who already have internet access. The internet is full of distractions, and we often end up seeing ads instead of actually learning something. It would be great to create personal collections of useful websites and have a powerful search tool to explore them—without relying on Google and getting sidetracked.

Right now, Kiwix’s search function is quite basic and doesn’t work well with large amounts of content. It might be worth exploring the use of a generative language model or embedding model to improve its accuracy and usefulness.

  • rollcat 6 hours ago

    These devices are insanely underpowered for running LMs. We've had pretty decent search long before the current hype wave. Postgres has built-in support for FTS in multiple languages[1]; if you want to get fancy you can maybe throw in a table of common synonyms.

    [1]: https://www.postgresql.org/docs/17/textsearch-intro.html

    Otherwise agreed.

    • RenThraysk 5 hours ago
      • rollcat 5 hours ago

        Yeah that's $350 vs $35 - just to run text search?

        • RenThraysk 4 hours ago

          I don't think the determining factor is price, it is how easy or hands off the appliance can be.

          Every postgres search implementation has at least one internet connected postgres nerd on standby.

          • rollcat 3 hours ago

            > I don't think determining factor is price

            This is a solution that is supposed to be affordable in third-world countries, where 99% of all personal computing happens on low-to-mid-end smartphones.

            Unless you specifically mean first-world countries, in which case you can just run a hotspot from your PC or laptop.

            > it is how easy or hands off the appliance can be.

            I can already barely justify having a single RasPi plugged into my router, and I'm a devops nerd. That box does one single thing that is too annoying to set up on my Mac mini. 99% of people do not want yet another blackbox appliance, this is why every single SOHO router has a built-in switch and AP.

            > Every postgres search implementation has atleast one internet connected postgres nerd on standby.

            It doesn't have to be postgres specifically, just pointing out that decent FTS has been commodity software for decades. Throwing LMs at every already-solved problem is the problem with LMs.

            • RenThraysk 2 hours ago

              It's not a solved problem for people that aren't technically capable, and do not have internet access.

              It's why Kiwix was developed, so people don't have to run a full web stack (which requires an RDBMS) for Wikipedia.

              I don't know if an LM is the answer, and probably not in power-deprived areas, but running an RDBMS definitely isn't it.

              • rollcat an hour ago

                > running a rdbms

                I agree with you on every step and every point except the LM. Your entire argument hinges on FTS requiring a fat RDBMS. This is not the case.

                Postgres is what I have experience with, but I just looked up SQLite and indeed it has FTS5[1]. This is not an LM-grade problem, this is a solved problem.

                [1]: https://www.sqlite.org/fts5.html
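
                As a rough illustration of how little is needed, here is a minimal sketch using Python's built-in `sqlite3` module with an FTS5 virtual table. It assumes the bundled SQLite was compiled with FTS5 (common, but not guaranteed); the table name, sample articles and helper names are made up for the example and have nothing to do with how Kiwix actually indexes content.

```python
import sqlite3

# Made-up sample data; a real index would hold article text extracted elsewhere.
SAMPLE_ARTICLES = [
    ("Solar panel", "A solar panel converts sunlight into electricity."),
    ("Raspberry Pi", "A series of small single-board computers."),
    ("Battery", "Batteries store electricity for later use."),
]


def build_index(articles):
    """Create an in-memory FTS5 index from (title, body) pairs."""
    conn = sqlite3.connect(":memory:")
    # The porter tokenizer adds English stemming on top of unicode61.
    conn.execute(
        "CREATE VIRTUAL TABLE docs USING fts5(title, body, tokenize='porter unicode61')"
    )
    conn.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", articles)
    conn.commit()
    return conn


def search(conn, query, limit=5):
    """Return matching titles, best match first (FTS5's built-in rank)."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [title for (title,) in rows]


conn = build_index(SAMPLE_ARTICLES)
print(search(conn, "electricity"))
print(search(conn, "computer"))  # stemming lets this match "computers"
```

                The whole index lives in a single file (or in memory, as here), with no server process to administer.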

gioazzi 19 hours ago

Brilliant concept! I recently met the fine folks at Beekee who make something rather similar: https://beekee.ch/beekeebox/

It's an apparently simple problem on the surface, but quite hard to get it right... I once worked on a wireless network deployment for a transit refugee camp, and at least that was built on the assumption that some sort of Internet connection would be available at all times, making remote management possible. And even then it was tough to manage considering all other constraints.

I can only imagine how hard it is to deliver this kind of service reliably when Internet is rarely if ever available.

  • myself248 16 hours ago

    Ooo, there's another one to look at. There seem to be a bunch of these with variously-overlapping goals. Two more I'm aware of:

    https://bibliosansfrontieres.gitlab.io/olip/olip-documentati...

    https://wrolpi.org/

    And I feel like the PirateBox concept is sort of adjacent.

    • nobodyknowin 15 hours ago

      I saw a local agency demo their use of the pirate box for Wildland firefighting.

      They had a GIS team working on mapping updates to fire lines, cut lines, dozer paths, crew assignments, etc. And as required they'd upload everything to the pirate box and the commanders / captains could download the maps to their tablets.

      Amazing stuff all without internet.

  • Bengalilol 17 hours ago

    Did you meet Beekee in Geneva?

    I bet those kinds of boxes work very well when there are fewer than 30 connections at once. All in all, if it is about accessing useful information, I think this is somehow brilliant (as you wrote).

    • gioazzi 10 hours ago

      They are based in Switzerland as I am, but it so happened we met in Doha

      IIRC they support 40 concurrent users, and in their model that would always be a school class, which I guess shouldn’t be larger anyway

  • syncsynchalt 15 hours ago

    The article is about devices that don't use internet access — they provide a shallow copy of wikipedia, learning sites, and the infrastructure for devices to connect and use these as if they were connected to the full network.

ChrisMarshallNY 19 hours ago

I thought they meant this: https://www.youtube.com/watch?v=iDbyYGrswtg

Chuckles aside, it's a cool concept.

jauntywundrkind 17 hours ago

Vaguely interesting, but I am far more interested in actual connective technology.

The Commotion "Internet in a Suitcase" project (~2012) was much more up my alley. It is much more the sort of thing I wish that, for example, the State Department would still fund.

> Commotion relies on several open source projects: OLSR, OpenWrt, OpenBTS, and Serval project.

So, mesh, wifi, cellular, and voice technologies, packaged onto semi-affordable hardware... That's the real stuff! That's what democratic values should look like, that's what we could build that would embody our (USA's) founding principles and fight tyrannical info-control.

https://en.wikipedia.org/wiki/Commotion_Wireless

  • yapyap 15 hours ago

    I think that giving poor(er) people access to just the information that’s accessible on the internet, like Wikipedia or Khan Academy to learn programming, is so much more helpful than handing them access to “the internet” as in what the internet is for people in the rich parts of the western world.

    Sure, we can hand them access to all of the internet and have them scrolling social media till they’re hollow people and earn money by doing anything cause they have seen the way you can live in luxury and start idolizing that. Or you give them just the useful parts of the internet.

    • freefrog1234 7 hours ago

      I saw many projects while working in Cambodia, including One-Laptop-Per-Child, projects designed to share market information with farmers, etc and none made an impact like mobile phones.

      One project that was semi-successful were USAID sponsored internet cafes that were supposed to enable access to political information just before an election. The USAID staff were annoyed to find most Cambodians used them for international VoIP calls.

      Never assume you know better than the end users what they want from the internet. Now mobile companies move in so fast to conflict countries (from my experience in Afghanistan and Iraq), internet access is up there with electricity on the list of requirements.

    • aspenmayer 6 hours ago

      > Or you give them just the useful parts of the internet.

      Implying you (or anyone) knows better what is most useful to someone else isn’t fair to them. What if the sites that are most useful aren’t included in the backup because of blind spots or license/copyright issues?

      Saving a single page to view offline is simple enough for me, but many non-technical folks would struggle. A whole site? That is a bit harder. Most people could not do this outside manually crawling the site.

      Web archiving tools that are easy to use and allow offline use also have a role to play that other tools can’t fill due to many issues that are outside developers’ ability to change, and putting archiving capabilities in the hands of users directly allows users’ fair use rights to be used to good effect.

flaburgan 17 hours ago

With the crazy news we have had these past months, I actually started to wonder what would happen if the internet went offline "for real" (let's say, for several weeks) here in developed countries. I know we can easily download Wikipedia and OpenStreetMap. But what else? And how to share it? I can run a hotspot at home, but would my neighbors understand it? I would need some kind of captive portal to tell them when they connect to me. And then, could they repeat the hotspot, to build a mesh? I know there are projects to do that, but what do they accomplish exactly? I remember 10 years ago, in Ubuntu, Empathy allowed me to chat with people connected to the same network as me. No account, no registration. That would be very useful. Does the PirateBox do all of that? How extensible is it?

  • dachris 11 hours ago

    If the internet went offline "for real" even for days, all hell would break loose. It would not be as bad as electricity going offline, but it's in the same category now. No credit card payments (food, gas), no logistics. Prepper stuff.

    The real scenario long-term is that quality content like Wikipedia could either be taken offline, be poisoned by AI or taken over or censored by authoritarians or corporations or interest groups. Like social media and parts of the normal internet already.

    So archiving is good anyways.

    To your actual question - quality non-fiction e-books would be valuable. Wikipedia is a superficial skimming of human knowledge, lots of the real stuff is in books (think medicine, agriculture, algorithms, engineering etc.).

    Practically, a home WIFI has a very limited range, so just handing out sheets with instructions to your neighbors would work. And for a wider mesh network, you'd need to make do with whatever evolves in that scenario.

    • dev0p 2 hours ago

      And a couple of hours later, major blackout in Spain.

      • encipriano an hour ago

        Yeah, who knows what the real cause is. Maybe a fire, maybe a cyberattack, or it could even be that someone messed up badly on their PC, because government security is disastrous. So who knows.

  • Cthulhu_ 7 hours ago

    > would my neighbors understand it?

    I think this identifies a real problem in western society; there's a crisis situation of sorts, but instead of people talking to each other, this comment worries about whether people can figure it out on their own.

    I believe that if there were to be a crisis situation like that, long-term power outage or whatever, people would find each other again instead of the individualism we have right now.

  • int_19h 13 hours ago

    AREDN can give you a mesh running standard internet protocols on fairly cheap hardware - they have firmware that can be flashed on small USB-powered GL-iNet devices such as Beryl:

    https://www.arednmesh.org/content/supported-devices-0

    It's self-configuring, too - as soon as the node spins up, it will automatically find and connect to nearby nodes and start routing.

  • ericb 15 hours ago

    I would add an LLM like QwQ-32B to the mix--that has a ton of compressed knowledge embedded in it.

    I would also store it in a steel Oscar the Grouch style trash can for a cheap faraday cage, which gets you protection from solar flares, and EMP blasts.

    • int_19h 13 hours ago

      LLMs are a bad deal when you look at how much power you need to run that inference. A device that could barely run one instance of QwQ-32B at glacial speeds will be able to serve multiple concurrent users of Kiwix.

      • ericb 13 hours ago

        To serve multiple users, probably not.

        But--if you don't think of asking Hacker News every single thing you need to know beforehand, I think you still want the LLM to answer questions and help you bootstrap it.

    • moffkalast 6 hours ago

      Learning things from scratch is really hard too, just a copy of wikipedia gets one absolutely nowhere if you don't know what to search for.

      Having something that you can plainly ask how to start that will point you in the right direction and explain the base concepts is worth a lot more, it turns raw data into genuine information. Yes it can be wrong sometimes, but so can human teachers and you can always verify, which is a good skill to practice in general.

  • numpad0 3 hours ago

    IMO: these kinds of prepper offline Internet apps should support Android/Windows and have .apk/.zip download links on their front pages so that they can reproduce themselves. They should run on something like a random cracked phone charging from the USB port on a Wi-Fi router.

  • kilroy123 14 hours ago

    I've gone a step further, I have built a 50 TB NAS and I'm loading up as much as I possibly can.

    • aspectmin 14 hours ago

      I’d love some detail on your setup. Which NAS, which drives? Key data sets? (Would love to build one myself)

      • kilroy123 14 hours ago

        I have 4 Seagate Exos X18 14TB drives crammed into a JONSBO N2 Black NAS ITX Case. 2 TBs of SSDs, and 64 GB of RAM. (Going to double the ram soon)

        • simgt 3 hours ago

          Why is it useful to have as much as 64GB of RAM in a NAS?

          • xandrius 14 minutes ago

            What if you want to load all of Wikipedia and images on the RAM for even faster access of every page?

  • 9x39 14 hours ago

    Too little bandwidth and too few nodes to do this in the sense I think you mean.

    You can build a hotspot and try setting up meshes with any of the available hardware or software packages out there, but you're going to end up being the gatekeeper to the service. HAM radio ends up working out the same way, as I understand. It's just too technical for people to have this spring up collectively without a single person or team doing everything.

    Lack of tech experience to even know how to build a mesh let alone prioritize its limited bandwidth is why the general public isn't going to assist.

    >And then, could they repeat the hotspot, to build a mesh? I know there are projects to do that, but what do they accomplish exactly?

    Yes, pretty much. The problem is poor definition of the problem, though.

    What are we trying to solve? A way to send trickles of comms out, like "Mom and Dad, we're alive?" or "We have life-threatening casualties at x',y'?" Emergency kiosk to send emails one at a time? Doable if you have an Internet source like a Starlink, or any other uplink that's still up somehow.

    Or is it to restore the "Internet" as generally known, which might as well be synonymous with YouTube and Netflix and web browsing for people? You and your system would be overwhelmed as soon as your mesh comes up.

    • fc417fc802 9 hours ago

      You might be able to define this particular problem using negatives instead of positives. Everything except stuff like YouTube, Netflix, etc. If you cut out images, audio, video, and other intensive data streams text based communication in natural language is extremely lightweight. Particularly if the protocol is carefully designed with the intended deployment conditions in mind.

      I guess a requirement for that is a sufficiently generalized protocol with a matching hardware stack.

  • pixl97 17 hours ago

    I started in the ISP business and did WISP stuff for a while, so doing this wouldn't be too difficult for me. The hardest thing would be scaling it with the equipment the average user would have.

  • tarruda 17 hours ago

    I wonder if it is possible to have some kind of P2P protocol similar to BitTorrent where one can seed incremental snapshots of subsets of the internet.

    Something like the internet archive, but fully decentralized.

    • metasj 14 hours ago

      Sounds like something r/datahoarders would be / may already be into.

VikingCoder 12 hours ago

So if I bought the $58 one from the Wikipedia Store...

What exact solar products (panels, battery, converter?) would I need to buy, near Chicago, to run one of these 24/7, year round, and let's say it's gotta be up and running most of the time - say, 99% of the time. (That means it can be down over 3 days a year, and still be acceptable to me.)

  • godelski 11 hours ago

    It is running a Pi Zero 2 W, and that should have a max draw of 2.5A@5V (12.5W)[0]. A watt-hour is one watt used for an hour, so the question is how many hours you want your Pi to run continuously. For 24hrs, that's 24hrs*12.5W = 300Whrs. Just for an estimate, a shargeek is $100 and will give you 24 Whrs[1]. I'm sure you could build your own solution for much less, but since you're asking, I'm assuming that isn't a great option.

    So probably a bit more expensive than you're thinking. Especially if you're putting it outside, as you'll need to protect the thing from the weather. But the good news is you probably aren't going to actually be pulling those 12.5W on your Pi. You should probably measure and see.

    For solar, I'm not sure especially since you'll need to adjust for your requirements. But there are nice resources that can tell you average capacity, but be careful to note that these will usually show averages and you're going to be significantly affected by seasons in Chicago.

    Honestly, I'd get a small battery (like for a phone) and hook it up to an outlet and tuck it away somewhere. That's a much cheaper option. Even if you're "going rogue" with it... the power draw is so little you won't really notice it.

    [0] https://www.raspberrypi.com/documentation/computers/getting-...

    [1] https://sharge.com/products/shargeek-170
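
    The arithmetic above can be sketched in a few lines of Python. The 12.5 W figure is the worst-case draw from this comment; the 6 W and 1 W rows are the average and idle estimates other commenters give in this thread, included only to show how much the daily budget depends on the real, measured draw. The function name is made up for the example.

```python
def daily_energy_wh(power_w, hours=24.0):
    """Energy in watt-hours used by a constant load over the given hours."""
    return power_w * hours


# Draw scenarios in watts: worst case (5 V * 2.5 A), plus estimates from the thread.
SCENARIOS = {
    "max draw": 5.0 * 2.5,
    "average estimate": 6.0,
    "idle estimate": 1.0,
}

for label, watts in SCENARIOS.items():
    print(f"{label}: {watts} W -> {daily_energy_wh(watts)} Wh per day")
```

    Measuring the actual draw first matters because the gap between the worst case and idle is more than a factor of ten.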

  • pastage 11 hours ago

    This is not for that kind of setup; IMHO this is more of a button you press to get internet when you need it. It idles at 1 watt, so you need 24Wh per day to keep it running if there is no sun. On bad, short days you might get 5% of the solar power. A battery system might lose you 30% on that. Uptime is not primarily a technical issue; it depends on what your goals, skills and needs are.
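
    A rough sketch of what those numbers imply for panel size, in Python. The 1 W load, the 5% bad-day yield and the 30% battery loss come from this comment; the 8 hours of usable daylight is purely an assumption for the example, as is reading "5%" as a fraction of the panel's rated output, so treat the result as an order-of-magnitude estimate only.

```python
def panel_watts_needed(load_w, bad_day_fraction, sun_hours, battery_loss):
    """Rated panel watts that cover a full day's load on a bad-weather day."""
    daily_need_wh = load_w * 24.0
    # Energy that must be harvested so the load is still covered after battery losses.
    harvest_needed_wh = daily_need_wh / (1.0 - battery_loss)
    # On a bad day the panel delivers only a fraction of its rating, for sun_hours hours.
    return harvest_needed_wh / (bad_day_fraction * sun_hours)


# sun_hours=8.0 is an assumed value, not something stated in the thread.
required = panel_watts_needed(
    load_w=1.0, bad_day_fraction=0.05, sun_hours=8.0, battery_loss=0.30
)
print(f"Panel rating needed to break even on a bad day: {required:.1f} W")
```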

  • varenc 11 hours ago

    A 50Ah 12V deep cycle car battery would give you at most 600Wh. This RPi's max draw is 12.5W, but an average draw of 6W is probably reasonable. So that battery alone would get you at most `600 watt-hours / 6 watts = 100 hours` of run time (minus some voltage conversion loss and other imperfections). Then just hook up some solar to charge the battery and assume at least 25% loss. ~70W should probably be enough even for a Chicago winter. The RPi might actually idle at around 1W, making it even easier.

    All that said, there might be a better solution for you if 99% uptime offline personal Wikipedia is your only goal. And IIAB isn't really optimized for high availability I assume. (just get a local copy of Kiwix on your phone+laptop? Or a cheap dedicated tablet for Kiwix would probably cost less than the battery+solar setup)
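
    For anyone who wants to play with the numbers, the battery estimate can be sketched like this in Python. The 50 Ah, 12 V, 6 W and 1 W figures are from this comment; the 10% conversion loss is a placeholder assumption, not a measured value, and the function name is made up.

```python
def runtime_hours(capacity_ah, voltage_v, load_w, loss=0.0):
    """Hours a battery can carry a constant load, with an optional loss fraction."""
    energy_wh = capacity_ah * voltage_v
    usable_wh = energy_wh * (1.0 - loss)
    return usable_wh / load_w


ideal = runtime_hours(50, 12, 6)                  # 600 Wh / 6 W
with_loss = runtime_hours(50, 12, 6, loss=0.10)   # assumed 10% conversion loss
at_idle = runtime_hours(50, 12, 1)                # if the Pi really idles near 1 W

print(f"ideal: {ideal:.0f} h, with loss: {with_loss:.0f} h, at idle: {at_idle:.0f} h")
```

    Note that this ignores depth-of-discharge limits, which would reduce the usable capacity further.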

  • Qqqwxs 11 hours ago

    I'm running a Raspberry Pi based GNSS receiver from a 26 Ah SLA battery and an 80W panel. Just passed 2 weeks of uptime in a cloudy period of southern hemisphere autumn.

    A monte carlo simulation using historical conditions said it had a ~95% chance of no downtime over 3 winter months. A slightly larger battery would bring that up to 99%.

    The Pi (3b+), GNSS receiver (u-blox ZED F9P), and Waveshare 7600G 4G modem average about 3.5W idle. The GNSS receiver is about 0.1 - 0.2 W of that. Wifi would be more energy efficient, I imagine.
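
    A toy version of that kind of simulation, in Python, for anyone curious. This is not the simulation I ran: instead of historical conditions it draws each day's harvest from a made-up triangular distribution of "equivalent full-sun hours", it assumes a 12 V battery that is fully usable, and it works at whole-day granularity. Every parameter default and function name here is an assumption for illustration.

```python
import random


def survives(battery_wh, load_w, daily_harvest_wh):
    """Return True if the battery never runs empty over the given days."""
    charge = battery_wh  # start with a full battery
    daily_load_wh = load_w * 24.0
    for harvest_wh in daily_harvest_wh:
        # The battery cannot store more than its capacity.
        charge = min(battery_wh, charge + harvest_wh - daily_load_wh)
        if charge < 0:
            return False
    return True


def no_downtime_probability(battery_wh, load_w, panel_w, days=90, trials=2000, seed=0):
    """Estimate the chance of zero downtime using synthetic weather."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        # Assumed weather model: 0.2 to 4.0 equivalent full-sun hours, most often 1.0.
        harvest = [panel_w * rng.triangular(0.2, 4.0, 1.0) for _ in range(days)]
        if survives(battery_wh, load_w, harvest):
            ok += 1
    return ok / trials


base = no_downtime_probability(battery_wh=26 * 12, load_w=3.5, panel_w=80)
bigger = no_downtime_probability(battery_wh=1.5 * 26 * 12, load_w=3.5, panel_w=80)
print(f"26 Ah: {base:.3f}, 39 Ah: {bigger:.3f}")
```

    Because both runs use the same seed and therefore the same synthetic weather, the larger battery can only do as well or better, which makes the two estimates directly comparable.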

    • simgt 2 hours ago

      Is it an RTK base station? If so I'd love to know more about why you set one up.

  • jzemeocala 11 hours ago

    I actually did something just like this with the second raspberry pi and an offline copy of Wikipedia.

    Although I don't know this device's specs, I recall being able to reliably power the Pi 2B + a 3.5 inch touchscreen with a random 15,000 mAh solar power bank from Amazon.

netsharc 18 hours ago

I remember encountering this project: https://piratebox.cc/faq , I even still have a compatible hardware at home.

I wonder if allowing it to have instant messaging (including offline asynchronous messaging) would change how people in a small community communicate with each other. I wonder if, for one, it would induce Internet trolling.

  • blacksmith_tb 17 hours ago

    Doesn't seem likely - the key to trolling is lack of accountability, in a small community everyone would know you were being a jerk?

    • netsharc 16 hours ago

      What if the system allowed anonymous accounts?

      It'd be interesting if one had to go visit an admin (in real life) to get an account, and accounts are really associated with people.

    • squigz 7 hours ago

      Guys... people are jerks IRL too, in small communities too.

    • esseph 16 hours ago

      Some people REALLY like jerks.

  • BLKNSLVR 14 hours ago

    I have one of these, not in operation, but I could deploy it by just plugging it into power and attaching it to a higher gain antenna. It does have a message-board type function included, but it's anonymous and there's no sign-in, so there's also no way to administer it or edit / delete your own messages (piratebox is a product of a more innocent age, maybe). It's also possible to upload files to it, anonymously.

    Both of these things make me worry about liability in the event of the type of jerks where the term "jerk" is possibly the nicest way to describe the person.

    (I have it on a GL.Inet Mango device and it took me a lot of digging to find the install binaries and instructions, and I don't even know if said binaries and instructions specific to the Mango exist anymore - I don't have the time / energy / motivation to try to dig it up again, I remember there were lots of trails that led to almost the right information)

FerretFred 7 hours ago

I've set up IIAB and tried Kiwix as well. I used a Pi Zero 2 and, as it's only me using it, I found the user experience was good, although given the Zero's power, not high-speed. I managed to gain enough knowledge to create my own ZIM files of some selected YouTube content (including my own). Being aware of the recent propensity of useful data to "vanish", I've approached a couple of online content providers to see if they'd ask to get their content added to Kiwix: sadly I heard nothing back, but I'll keep trying.

asdefghyk 3 hours ago

Low-bandwidth recollections I have: dial-up modems at 2400 baud, then 33K and finally 56K. Dial-up connections early in the morning, say 5 - 6 am, would give very noticeably faster webpage loading times. The mail client (like the Eudora email client) would connect and just load email headers (or titles), then I would select which emails I wanted to retrieve the bodies for... This would have been the early 90s to early 2000s, when I stopped using a dial-up connection.

kh_hk 18 hours ago

Seeing the demo I noticed it looks like this "prepper disk" that was submitted days ago https://news.ycombinator.com/item?id=43790409

Makes me think the prepper disk was maybe a rebrand of internet in a box without proper attribution?

  • entropie 17 hours ago

    > PrepperDisk is similar to a DIY, open-source project that started in 2012 called Internet in a Box and which has become popular in rural areas in developing countries where internet access is sparse. The idea is basically that you can carry around an external hard drive-sized, mini version of the internet with you that creates a local network your phone or laptop can access.

    > https://www.404media.co/sales-of-hard-drives-prepper-disk-fo...

    From the hn-thread. You might be right.

    • prepperdisk 15 hours ago

      We are actually IIAB partners, we attribute to all the various OSS projects (Kiwix, IIAB) in our credits and comply with all the licensing. Our goal was just to make those packages polished as a consumer product and add newer content (some licensed and some commissioned).

      • blacksmith_tb 14 hours ago

        Curious why you went with a 512GB SD card for the Prepperdisk, instead of a usb drive? I guess it might make the enclosure bigger, but every RPi thing I have built has been undone by SD card corruption (unless I used the overlay filesystem).

        • prepperdisk an hour ago

          Good question. Cost and ease of use were the primary drivers. It’s a read-heavy / write-light workload, which means we can expect more life than in some use cases. By contrast, we went with NVMe on the 1TB/LLM model due to its heavier write profile. Making a backup of the SD card is wise though.

        • int_19h 13 hours ago

          These things shouldn't need writable FS though.

      • kh_hk 8 hours ago

        And yet there's no mention of either project on https://www.prepperdisk.com/pages/about-us

        • prepperdisk 41 minutes ago

          Good feedback. Most of our customers are shopping for a turnkey device and we try not to include prominent details that could confuse that use case. But we are going to elevate the content for folks that would be interested (today).

        • squigz 7 hours ago

          Yeah, the only mention I can find of many of the open-source projects they use seems to be in one of the FAQs.

          • entropie 7 hours ago

            And even with you writing that, it was not easy to find because it's under an unrelated topic.

    • entropie 3 hours ago

      I asked Kiwix and Marc replied and said they are indeed affiliated. Marc also stated they offered help and are pretty cool guys.

  • SamBam 15 hours ago

    I'm not really sure I understand how a "Prepper Box" is different from an external hard drive. Unlike the devices in TFA, which are meant for many students in a classroom using at once, this seems to be more of a single-person-looking-up-things concept.

asdefghyk 14 hours ago

A feature that would be good to have is if an "Internet in a Box" could also store the update files for serving to other "Internet in a Box" machines.

(However, having worked as a technical software tester on similar systems for over 20 years, my GUESS is that it's probably too complex to implement reliably, given all the edge cases it would have to handle.)

  • iwantonething 13 hours ago

    I would love a little box that contains a huge database of info with an LLM on top that I could use offline.

kolleykibber 5 hours ago

Have to mention World Possible (https://worldpossible.org/) here. They are doing amazing similar work, with rpis and Intel CAP servers.

They also have Datapost (https://datapost.site/map) which uses android phones as data carriers between the remote location and the Internet, for email, stats and updates.

Sparkyte 13 hours ago

Title reminds me of that time Moss introduces Jen to the internet.

  • pryelluw 12 hours ago

    I don’t think the elders of the internet would mind.

keepamovin 7 hours ago

I like the idea of a hardware set up for this. One thing I'm considering for DiskerNet (dn - an offline, searchable, curated archive of the web that works in a normal browser) is using NNCP for updates. That way you can "request" pages that aren't in the archive, and "patch" them in from hardware updates later, USB sticks, floppies (haha) - whatever! And securely, thanks to the crypto in NNCP.

If anyone has a better idea of a protocol for this than NNCP, please suggest at: https://github.com/DO-SAY-GO/dn/issues/221

fsckboy 9 hours ago

>I also need an IC to convert the I2S signal (currently sent to the MAX98357A) to a headphone output. I haven't researched options yet. The motherboard will need additional routing to send the I2S signal from the ESP32-S3 to both the MAX98357A and this new IC. Suggestions welcome!

isn't a headphone just a high impedance speaker? you just grab the same output that goes to the speakers and reroute it through a big resistor to the headphones.

did a little searching, here you go (the extra resistance isn't that great)

https://samtechpro.blogspot.com/2014/03/how-to-use-speaker-o...

put your efforts into stereo, speaker and headphones, why wouldn't stereo set the stage for more complex brain development?

GuB-42 6 hours ago

It is a bit of a tangent, but I noticed that the picture on the lower right features an OLPC XO laptop. These are very interesting little machines from around 2010 [1], I wonder if some of them are still around today.

[1] https://en.wikipedia.org/wiki/One_Laptop_per_Child

  • whywhywhywhy 5 hours ago

    They never really delivered on the promise and the non-profit structuring of the org wasn’t conducive to actually delivering a product.

  • npteljes 4 hours ago

    There are some on eBay, at least.

asdefghyk 14 hours ago

This article made me remember, that in the early 90s there was a service to retrieve any webpage by email

  • dredmorbius 11 hours ago

    RMS still largely browses in this way:

    I generally do not connect to web sites from my own machine, aside from a few sites I have some special relationship with. I usually fetch web pages from other sites by sending mail to a program (see <https://git.savannah.gnu.org/git/womb/hacks.git>) that fetches them, much like wget, and then mails them back to me. Then I look at them using a web browser, unless it is easy to see the text in the HTML page directly. I usually try lynx first, then a graphical browser if the page needs it.

    <https://stallman.org/stallman-computing.html>

asdefghyk 14 hours ago

This seems to be the Wikipedia webpage for it https://en.wikipedia.org/wiki/Internet-in-a-Box

asdefghyk 13 hours ago

Reminds me of a research paper I once saw many years ago. It described a dual caching (web traffic) proxy system with a (very) slow/unreliable link in between. The proxies worked together to compress the data as it was transferred across the link. (The proxy on the receiving side cached information and only retrieved it again if it had changed.) My recollection is it also had options to reduce the resolution of images.
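
A rough sketch of how such a proxy pair might work, not a reconstruction of the paper. Everything here is an assumption for illustration: the class names `FarProxy` and `NearProxy`, the use of a SHA-256 digest to detect changes, zlib for compression, and a plain dict standing in for real HTTP fetches. Image downscaling is left out.

```python
import hashlib
import zlib


class FarProxy:
    """Sits on the well-connected side of the slow link.

    `origin` is a hypothetical dict (url -> bytes) standing in for real fetches.
    """

    def __init__(self, origin):
        self.origin = origin

    def fetch(self, url, known_digest=None):
        body = self.origin[url]
        digest = hashlib.sha256(body).hexdigest()
        if digest == known_digest:
            # Unchanged: only the short digest crosses the link.
            return digest, None
        # Changed or never seen: send a compressed copy.
        return digest, zlib.compress(body)


class NearProxy:
    """Sits next to the users and keeps a local cache."""

    def __init__(self, far):
        self.far = far
        self.cache = {}            # url -> (digest, body)
        self.bytes_over_link = 0   # payload bytes actually transferred

    def get(self, url):
        cached = self.cache.get(url)
        known = cached[0] if cached else None
        digest, payload = self.far.fetch(url, known)
        if payload is None:
            return cached[1]       # serve the cached copy
        self.bytes_over_link += len(payload)
        body = zlib.decompress(payload)
        self.cache[url] = (digest, body)
        return body


origin = {"http://example.org/": b"hello world " * 500}
near = NearProxy(FarProxy(origin))
near.get("http://example.org/")   # first request: compressed body crosses the link
near.get("http://example.org/")   # repeat request: served from the local cache
print(near.bytes_over_link, "payload bytes sent for two requests")
```

The idea is the same as an HTTP conditional request: the receiving side tells the far side what it already has, and the body only crosses the link when it differs.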

bdcravens 16 hours ago

I feel like this could be useful in an incarcerated environment, where they offer computer classes but can't grant access to the Internet for security reasons.

  • nonrandomstring 16 hours ago

    That sounds like the last few universities I taught in. (quite seriously).

realo 19 hours ago

So this allows people in poor countries to have access ONLY to the best curated resources available on the Internet?

And those people then have a better chance at a much better education?

Why not in developed countries schools as well?

  • skylerwiernik 18 hours ago

    Someone already thought of this and uploaded all of the contents of the box to a website! You can find it at wikipedia.com

    • sgt 17 hours ago

      That's missing the point. Full Internet access is just too broad. Going to wikipedia and aimlessly browsing about is fun, but a more educative approach can narrow the focus for students and especially for younger learners.

      How to market it in developed countries is going to be a tough nut to crack though.

      • bl4ckneon 17 hours ago

        Well there is nothing stopping any school in the developed world from loading this on to a pi or something and having everyone use it too. It's free and open source (from what I can tell).

        It's aimed at places with little to no, or unreliable, internet. So if you have normal internet speed there is nothing you can't get that's on the box. Also it seems that it's not even a curated Wikipedia, it's just a full clone of it (assuming for whatever language you're downloading)

      • SamBam 15 hours ago

        Plenty of schools have network control over the devices that are used in schools, meaning that you can indeed narrow the focus by only allowing a few websites to be accessed.

        My kid's school uses a software called GoGuardian, which allows individual teachers to whitelist specific websites for the students in their class during their class period.

aspectmin 14 hours ago

Such a cool idea.

In addition to Wikipedia, I’d love to see a mirror of all the health (NIH) and similar data

- key imagery, for example the human body

- (wishful) chatGPT 4o

souhail_dev an hour ago

raspberry pi + SSD + http server

drmacak 6 hours ago

Would be great to have this kind of internet in box but for 90s or 00s internet.

dcreater 17 hours ago

I've seen many such projects - while the problem is real, these solutions in actual practice/deployment are gimmicky and questionably useful at best. I'd like to hear from the actual people using this on a day to day basis - if any exist.

  • thefreeman 17 hours ago

    I think part of the point is you _can't_ hear from them... because they don't have access to the real internet?

fitsumbelay 13 hours ago

Reminds me of PirateBox/Library Box -- more the latter in terms of mission -- but w/ RPI instead of a router running OpenWRT (which was a wild experience ....)

Akatama 11 hours ago

Hey everyone!

I actually recently learned about Internet-in-a-Box myself and started contributing. It's cool to see that for the most part people seem to think this is a positive idea! Internet-in-a-Box has been around for a while (at least 10 years) and it's a great project with a lot of very passionate people working on it.

caseyy 15 hours ago

I wonder if Starlink et al won't soon solve the global internet access problem.

  • metasj 14 hours ago

    Every such provider likes to play games with who they give good rates to or who they exclude altogether from their network. People said the same about cellular access and it's still far too expensive for most people in most rural areas.

    • freefrog1234 7 hours ago

      Which countries are you referring to? I know from personal experience everyone in Cambodia and Afghanistan owns a mobile phone with internet access. They might not have a computer or reliable power, but they have Facebook accounts. Rest of ASEAN has excellent coverage as well, and I've heard Africa is similar

malux85 18 hours ago

This is such a cool idea.

When I was about 7-8 years old I used to get the "Tell me why" books, which were books that had 5-ish pages on all sorts of different topics.

https://archive.org/details/heresmoretellmew0000leok/page/n3...

These books sparked a lifelong curiosity in learning, I would sit for hours and hours and read them in my room. I hope that internet in a box inspires another generation of me's out there, who, like me, wouldn't otherwise have had access to this info.

aspectmin 14 hours ago

A mirror of Khan Academy would be great as well.

  • FerretFred 7 hours ago

    That's already an option on IIAB.

lutusp 13 hours ago

> ... Now you too can put the internet in a box and customize it with the very best free content for your school, clinic or family!

Yes ! And at very low cost! It doesn't require a network, a power connection or high technology! It prepares its users for adult life!

How can people read this article and not think, "Wait ... don't we already have books?"

  • zipping1549 12 hours ago

    Books

    - take up a lot of space

    - are not free & not necessarily cheap

    - are not searchable

    - take a lot of effort to index

    - are heavy, prone to damage & loss

    - cannot be accessed by multiple users simultaneously

    - need physical copy to back up

    • lutusp 9 hours ago

      Most of these points are true for computer media as well if separated from some degree of computer power, especially "not necessarily cheap".

      > ... need physical copy to back up

      Certainly true for computer storage, which, if left to the whims of nature and given time, will self-destruct.

      Years ago I got a call from Tom Clancy, who was writing "Hunt for Red October" using my word processor "Apple Writer." He said a diskette had become unreadable and asked how to recover its contents. I delivered the bad news and recommended that he use his backup diskette. I'm sure you can guess how that turned out.

aussieguy1234 13 hours ago

Let's put this "internet in a box" on a balloon and send it into North Korea

Uptrenda 14 hours ago

This has off-grid written all over it. You could build your own content pool for education + entertainment and you would never have to pay a forever subscription for an Internet connection. (Assuming you don't work online or something.) Quite an interesting idea to consider how to add the tools to this so it's practical to unplug. This idea appeals to me a lot tbh.

teleforce 15 hours ago

This is the limited offline version of the Internet for the rural communities but what we need is the local-first version of the Internet for the rural community [1].

Fun fact: about one third of the world's population, or 2.6 billion people, has no or very limited Internet connectivity [2]. The main root cause is most probably power, not the infrastructure.

Most of the people in authority probably don't realize that this rural connectivity does not need a fast, high-speed network as long as it has connectivity. It can be as slow as kbps bandwidth, a kind of "sipping" BitTorrent-based download, but a download nonetheless.

The main problem of Internet connectivity is not really the infrastructure itself but the overall power budget requirements for the connectivity infrastructure.

We need to bring back very efficient wireless modulation for remote and rural Internet, as exemplified by DMR with its very efficient 4-FSK [3],[4]. This type of wireless modulation employs constant-envelope modulation that is far more efficient (8 to 15 times more efficient) than the alternative TETRA with comparable bandwidth [5]. It's reported that DMR operates on 1 kWh per day while TETRA needs 15 kWh per day, so the former can be sustained by solar panels alone but not the latter.

Please note that TETRA itself is not a very efficient modulation, with π⁄4 differential quadrature phase-shift keying (π⁄4-DQPSK), since it requires linear amplifiers due to its non-constant-envelope wireless modulation. It's even worse for typical OFDM-based systems (e.g. Wi-Fi HaLow, LTE, 5G, etc.) [6]. This is because a similar power budget setup to DMR would have required probably around 100 times more power, or more than 100 kWh per day, including the air-conditioning systems for the linear power amplifier systems [7].

Thus these remote and rural base stations can be potentially powered by merely solar panels and the infrastructure does not need to be expensive since the base station structure can be made from bamboo [8].
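
To put the daily energy figures above into solar-panel terms, here is a back-of-the-envelope sketch. The 1, 15 and 100 kWh per day figures are the ones quoted in this comment; the 5 peak sun hours and 75% system efficiency are illustrative assumptions, not measurements, and the function name is made up for this example.

```python
def panel_watts_needed(daily_kwh, peak_sun_hours=5.0, system_efficiency=0.75):
    """Rough PV array size in watts needed to cover a daily energy load.

    peak_sun_hours and system_efficiency are assumed values; real sites vary
    a lot with latitude, season, battery losses and panel orientation.
    """
    return daily_kwh * 1000.0 / (peak_sun_hours * system_efficiency)


# Daily energy figures as quoted in the comment above.
for name, kwh in [("DMR", 1.0), ("TETRA", 15.0), ("OFDM-class", 100.0)]:
    print(f"{name}: roughly {panel_watts_needed(kwh):.0f} W of panels")
```

Under those assumptions a 1 kWh/day site needs a few hundred watts of panels, while a 15 kWh/day site needs a multi-kilowatt array before batteries are even considered.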

[1] Local-first software: You own your data, in spite of the cloud:

https://www.inkandswitch.com/essay/local-first/

[2] About one-third of the global population, or 2.6 billion people, remain offline.

https://www.itu.int/en/mediacentre/Pages/PR-2023-09-12-unive...

[3] Digital mobile radio (DMR):

https://en.wikipedia.org/wiki/Digital_mobile_radio

[4] DMR networks for health emergency management: A case study:

https://www.researchgate.net/publication/220761899_DMR_netwo...

[5] TETRA:

https://en.wikipedia.org/wiki/TETRA

[6] Orthogonal frequency-division multiplexing (OFDM):

https://en.wikipedia.org/wiki/Orthogonal_frequency-division_...

[7] Base Station ON-OFF Switching in 5G Wireless Networks: Approaches and Challenges:

https://www.researchgate.net/publication/315696556_Base_Stat...

[8] IEEE Connecting the Unconnected (CTU) 2022 Challenge Winners:

https://ctu.ieee.org/challenge/2022-ctu-challenge-2/

einpoklum 16 hours ago

It is the opposite of Internet in a box. This exemplifies how, in our day and age, "The Internet" is mostly a one-sided experience where a few large organizations offer a bunch of content, which they control, for your perusal.

It would have been more "Internet in a box" if it would have helped people set up their own services and pages; and if it were extensible using other radio-capable devices.

  • SamBam 15 hours ago

    I was thinking this too.

    They do talk about using FileZilla or Nextcloud to upload files, and mention using CMSes like WordPress, so maybe it's quite possible, just not a big focus.

    I agree that making it easy for teachers, students, and anyone in the community making their own discoverable webpages would be a great aspect to this.

zkiihne 16 hours ago

I want this but an LLM.

  • notarealllama 16 hours ago

    Open WebUI, and you can run quantized or low-end models (Llama 3 4b or gemma 4b) on a 4-6GB graphics card.

    It's a game changer to run local (no usage caps for a weekend blitz project)

    • drittich 14 hours ago

      I played with gemma-3-4b-it-qat recently using a mid-tier graphics card and a few things stood out to me:

      1. It was very fast, between 35 and 70 tokens per second, with initial response in under 200ms. That kind of speed is a feature.

      2. It was very useful. I had a brainstorming session with it that was both fluid and fruitful

      3. I can't wrap my head around so much knowledge being contained in about 3GB of data. It seems to know something about everything. Imperfect, but very useful.

zeroday28 16 hours ago

Internet with privacy

CoolDogs 8 hours ago

Huge respect to everyone who’s worked on this, it's really cool!

petesergeant 11 hours ago

Surprised that there are still so many people without access to the internet[0], and even more surprised that that's true in India and China, both of which have pretty high population densities (in most areas, but also, those are the areas people live in, obviously).

> But these user figures also suggest that 652 million people in India did not use the internet at the beginning of 2025, suggesting that 44.7 percent of the population remained “offline” at the start of the year.[1]

I guess I'm mentally comparing this to SE Asia, where smart phone usage (and cell coverage) is ubiquitous; Vietnam at 80%, Philippines at 75%, Thailand at literally everyone. Fewer in Indonesia, but geography there is especially challenging

0: https://www.statista.com/statistics/1155552/countries-highes...

1: https://datareportal.com/reports/digital-2025-india

amelius 18 hours ago

What would it take to give these poor children a Starlink connection?

  • WorldWideWebb 16 hours ago

    Being indebted to Elon and the Orange Administration. Probably not the best solution to a problem like this.

  • 9x39 14 hours ago

    The answer to ubiquitous Internet in an area is going to look like 5G and mobiles, not highly valuable terminals you now saddle them with having to guard.

  • esseph 16 hours ago

    So far my answer would be: "US government subsidies"

nsonha 13 hours ago

So by intent it's like a CDN node, but decentralized

didgeoridoo 17 hours ago

“This, Jen, is the Internet.”

bobsmooth 18 hours ago

This is cool but I feel like buying a 1000 count of cheap USB sticks and loading them with wikipedia or whatever would be cheaper and more useful.

  • minhazm 17 hours ago

    What would they be plugging these USB flash drives into? Underdeveloped countries / regions have very low penetration with traditional laptops / desktop computers. But nearly everyone has a smartphone of some kind, which has WiFi. That's the reason the form factor of these are mobile hotspots.

    • drilbo 16 hours ago

      Well, I have a few flash drives with USB-C connections built in, and 1 or 2 with USB-B

    • brewdad 17 hours ago

      Won’t most Android phones accept a USB device for reading/storage? Not much iPhone penetration in the developing world.

  • metasj 14 hours ago

    At the least you need storage for the content ($10-20), and a physical device to read it off of ($20-40). If your community has lots of computers with no wireless antennas (old servers), USB keys that can be passed around make sense. If you have lots of mobile devices with wireless (phones or tablets), a hotspot like this makes sense and can be used by dozens of people at once.

    In practice USB keys run a slightly higher risk of being wiped and repurposed for personal storage. Some IIAB users glue the SD card into their socket for that reason, making it take a lot longer to swap out.

  • blacksmith_tb 17 hours ago

    Isn't the idea that you only need a minimal device with a browser to access stuff on the portable hotspot? Handing out USB drives would be easy, but people who had only phones might not find them as practical (and they'd tend to get wiped / sold / lost / etc.)

  • jackphilson 18 hours ago

    probably access to chatgpt4o is a million times more useful than wikipedia imo

    • dymk 17 hours ago

      Cool all they'll need to do is find $20/month per person for the subscription, how hard could that be in rural South Africa or Nepal?

    • cookie_monsta 17 hours ago

      Yes, why does the developing world need an education system when we could be flooding them with AI hallucinations?

aaron695 16 hours ago

Internet in a Box DOES NOT WORK

I don't know how many more times this needs to be proven.

You do not understand data, you do not understand the reality around you, you do not understand 80:20 style engineering rules, you can't look at previous implementations of the Internet in a Box and see it didn't work.

Worse, you now live in a Starlink era where you can give them a real "Internet in a Box" and there is no solution people can just roll out. Talking a proper Linux setup, real hacker and nerd stuff, but because it's not Data Hoarding 101 no one will tackle it.

Internet in a Box is a great example of why Western foreign aid is failing and China is moving in. The West no longer builds infrastructure for the poor (like a Linux build) and feeds them wishy washy stuff. Facebook does more for the poor in this area.

  • nchmy 13 hours ago

    Could you elaborate on/point us towards resources about why it does not work?

    • aaron695 10 hours ago

      It's been around since 2012 and the best Wikipedia has - "explored using these boxes in the Dominican Republic for three months." - https://en.wikipedia.org/wiki/Internet-in-a-Box

      Where is a follow up study 3 years later? Are there any follow up studies [1]?

      (I have said in past comments the medical version might work, it failed in the Dominican Republic [1])

      One follow up study - [1] https://www.sipa.columbia.edu/sites/default/files/2023-01/In... which is not real positive.

      Why does it not work? An idea to think about, physical encyclopedias and university textbooks have been around forever and accessible.

      I've worked in schools with 1000's on one ADSL trying to use proxies to cache. I've volunteered for a few years in country on mobile apps for use in very low PPI villages without internet. I'm not coming at this with zero experience.

      It's impossible to discuss these things on HN, there is neither the technical nor the process knowledge here. I could be totally wrong but I have looked hard to disprove myself.

      The solution I propose: Starlink -> box with billing / access control + terabytes of offline vids/books + fun online games, that an idiot with determination can manage and make a profit or run free, and that works well in a country where the MPAA doesn't exist.

      • nchmy 2 hours ago

        I've spent the past 8 years focused on this theme, more or less, because my conclusion after finding things like internet on a box (and various similar projects) was that they offer too much information (and also present it in a useless manner).

        I've lived in many deeply impoverished communities, and they don't need a full version of Wikipedia, or a medical encyclopedia. They need to know how to easily and cheaply implement basic sanitation and hygiene, basic first aid and nutrition, efficient sustainable cooking and agricultural practices etc... Plenty of effort was spent on that stuff in the Appropriate Technology movement, but all I ever see from it all is wiki articles or 70 page pdfs (which many can't even read, even if they were in their native language). No one is learning anything from those resources.

        I brought this up with some orgs similar to IIAB, and they were just bewildered. So, I made it my life's work to address this in a genuinely useful, accessible, practical way. Hopefully I'll be able to share something in the next year or two...

        This, of course, is not to say that they don't have the current or future potential to learn these things, let alone that such info should be withheld. But it's just a matter of practicality of time, resources, efficacy etc. If they can get out of extreme poverty - which many basic things can very much help with - maybe they'll have more time, health and money to pursue academic education, etc

        The proof of this, of course, is that we have access to all of humanity's knowledge, and have self-evidently done very little with it.

      • RenThraysk 3 hours ago

        Your analogy with books doesn't really work, as a book is only usable by one individual at a time, whereas a digital book can be shared.