Humane AI Pin

I wrote a blog post The Shape of Computers [1] exploring ideas of how computers might evolve and how we can use them. One of the devices I mentioned was the Humane AI Pin, which has just been the recipient of one of the biggest roast reviews I’ve ever seen [2], good work Marques Brownlee! As an aside I was once given a product to review which didn’t work nearly as well as I think it should have worked so I sent an email to the developers saying “sorry this product failed to work well so I can’t say anything good about it” and didn’t publish a review.

One of the first things that caught my attention in the review is the note that the AI Pin doesn’t connect to your phone. I think that everything should connect to everything else as a usability feature. For security we don’t want so much connecting and it’s quite reasonable to turn off various connections at appropriate times for security, the Librem5 is an example of how this can be done with hardware switches to disable Wifi etc. But to just not have connectivity is bad.

The next noteworthy thing is the external battery which also acts as a magnetic attachment from inside your shirt. So I guess it’s using wireless charging through your shirt. A magnetically attached external battery would be a great feature for a phone, you could quickly swap a discharged battery for a fresh one and keep using it. When I tried to make the PinePhonePro my daily driver [3] I gave up and charging was one of the main reasons. One thing I learned from my experiment with the PinePhonePro is that the ratio of charge time to discharge time is sometimes more important than battery life and being able to quickly swap batteries without rebooting is a way of solving that. The reviewer of the AI Pin complains later in the video about battery life which seems to be partly due to wireless charging from the detachable battery and partly due to being physically small. It seems the “phablet” form factor is the smallest viable personal computer at this time.

The review glosses over what could be the regarded as the 2 worst issues of the device. It does everything via the cloud (where “the cloud” means “a computer owned by someone I probably shouldn’t trust”) and it records everything. Strange that it’s not getting the hate the Google Glass got.

The user interface based on laser projection of menus on the palm of your hand is an interesting concept. I’d rather have a Bluetooth attached tablet or something for operations that can’t be conveniently done with voice. The reviewer harshly criticises the laser projection interface later in the video, maybe technology isn’t yet adequate to implement this properly.

The first criticism of the device in the “review” part of the video is of the time taken to answer questions, especially when Internet connectivity is poor. His question “who designed the Washington Monument” took 8 seconds to start answering it in his demonstration. I asked the Alpaca LLM the same question running on 4 cores of a E5-2696 and it took 10 seconds to start answering and then printed the words at about speaking speed. So if we had a free software based AI device for this purpose it shouldn’t be difficult to get local LLM computation with less delay than the Humane device by simply providing more compute power than 4 cores of a E5-2696v3. How does a 32 core 1.05GHz Mali G72 from 2017 (as used in the Galaxy Note 9) compare to 4 cores of a 2.3GHz Intel CPU from 2015? Passmark says that Intel CPU can do 48GFlop with all 18 cores so 4 cores can presumably do about 10GFlop which seems less than the claimed 20-32GFlop of the Mali G72. It seems that with the right software even older Android phones could give adequate performance for a local LLM. The Alpaca model I’m testing with takes 4.2G of RAM to run which is usable in a Note 9 with 8G of RAM or a Pixel 8 Pro with 12G. A Pixel 8 Pro could have 4.2G of RAM reserved for a LLM and still have as much RAM for other purposes as my main laptop as of a few months ago. I consider the speed of Alpaca on my workstation to be acceptable but not great. If we can get FOSS phones running a LLM at that speed then I think it would be great for a first version – we can always rely on newer and faster hardware becoming available.

Marques notes that the cause of some of the problems is likely due to a desire to make it a separate powerful product in the future and that if they gave it phone connectivity in the start they would have to remove that later on. I think that the real problem is that the profit motive is incompatible with good design. They want to have a product that’s stand-alone and justifies the purchase price plus subscription and that means not making it a “phone accessory”. While I think that the best thing for the user is to allow it to talk to a phone, a PC, a car, and anything else the user wants. He compares it to the Apple Vision Pro which has the same issue of trying to be a stand-alone computer but not being properly capable of it.

One of the benefits that Marques cites for the AI Pin is the ability to capture voice notes. Dictaphones have been around for over 100 years and very few people have bought them, not even in the 80s when they became cheap. While almost everyone can occasionally benefit from being able to make a note of an idea when it’s not convenient to write it down there are few people who need it enough to carry a separate device, not even if that device is tiny. But a phone as a general purpose computing device with microphone can easily be adapted to such things. One possibility would be to program a phone to start a voice note when the volume up and down buttons are pressed at the same time or when some other condition is met. Another possibility is to have a phone have a hotkey function that varies by what you are doing, EG if bushwalking have the hotkey be to take a photo or if on a flight have it be taking a voice note. On the Mobile Apps page on the Debian wiki I created a section for categories of apps that I think we need [4]. In that section I added the following list:

Voice input for dictation
Voice assistant like Google/Apple
Voice output
Full operation for visually impaired people

One thing I really like about the AI Pin is that it has the potential to become a really good computing and personal assistant device for visually impaired people funded by people with full vision who want to legally control a computer while driving etc. I have some concerns about the potential uses of the AI Pin while driving (as Marques stated an aim to do), but if it replaces the use of regular phones while driving it will make things less bad.

Marques concludes his video by warning against buying a product based on the promise of what it can be in future. I bought the Librem5 on exactly that promise, the difference is that I have the source and the ability to help make the promise come true. My aim is to spend thousands of dollars on test hardware and thousands of hours of development time to help make FOSS phones a product that most people can use at low price with little effort.

Another interesting review of the pin is by Mrwhostheboss [5], one of his examples is of asking the pin for advice about a chair but without him knowing the pin selected a different chair in the room. He compares this to using Google’s apps on a phone and seeing which item the app has selected. He also said that he doesn’t want to make an order based on speech he wants to review a page of information about it. I suspect that the design of the pin had too much input from people accustomed to asking a corporate travel office to find them a flight and not enough from people who look through the details of the results of flight booking services trying to save an extra $20. Some people might say “if you need to save $20 on a flight then a $24/month subscription computing service isn’t for you”, I reject that argument. I can afford lots of computing services because I try to get the best deal on every moderately expensive thing I pay for. Another point that Mrwhostheboss makes is regarding secret SMS, you probably wouldn’t want to speak a SMS you are sending to your SO while waiting for a train. He makes it clear that changing between phone and pin while sharing resources (IE not having a separate phone number and separate data store) is a desired feature.

The most insightful point Mrwhostheboss made was when he suggested that if the pin had come out before the smartphone then things might have all gone differently, but now anything that’s developed has to be based around the expectations of phone use. This is something we need to keep in mind when developing FOSS software, there’s lots of different ways that things could be done but we need to meet the expectations of users if we want our software to be used by many people.

I previously wrote a blog post titled Considering Convergence [6] about the possible ways of using a phone as a laptop. While I still believe what I wrote there I’m now considering the possibility of ease of movement of work in progress as a way of addressing some of the same issues. I’ve written a blog post about Convergence vs Transferrence [7].

etbe – Russell Coker

Linux, politics, and other interesting things

Humane AI Pin