Tag Archives: DIY

Make A Natural Language Phone Bot Like Google’s Duplex AI

After seeing how Google’s Duplex AI was able to book a table at a restaurant by fooling a human maître d’ into thinking it was human, I wondered if it might be possible for us mere hackers to pull off the same feat. What could you or I do without Google’s legions of ace AI programmers and racks of neural network training hardware? Let’s look at the ways we can make a natural language bot of our own. As you’ll see, it’s entirely doable.

Breaking Down The Solution

One of the first steps in engineering a solution is to break it down into smaller steps. Any conversation consists of a back-and-forth between two people, or a person and a chunk of silicon in our case.

Conversation process with chatbot

Let’s say we want to create a bot which can order a pizza for us over the phone. The pizza place first says something to us. Some software then converts that speech to text or breaks it down into some other useful form. More software then formulates a response. And lastly, text-to-speech software or pre-recorded sound bites reply to the pizza place through a speaker into the phone.
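In code, the whole bot is just that loop. Here’s a minimal Python sketch with each stage stubbed out; every function name is a placeholder for whichever speech-to-text, response logic, and text-to-speech pieces you end up choosing, not any particular library’s API.

```python
# A minimal sketch of one turn of the conversation. Every function here is a
# placeholder stub, not a real API; swap in real components as you go.

def speech_to_text(audio: bytes) -> str:
    """Stub: hand the audio to DeepSpeech, Sphinx, or an online service."""
    return "Will that be all?"

def formulate_response(text: str) -> str:
    """Stub: an if-then-else mass, AIML, or an intent service goes here."""
    return "Yes, that's everything, thanks."

def text_to_speech(text: str) -> bytes:
    """Stub: synthesize speech, or look up a pre-recorded sound bite."""
    return b"...wav data..."

def handle_turn(audio_from_phone: bytes) -> bytes:
    """One full back-and-forth: hear the pizza place, decide, answer."""
    heard = speech_to_text(audio_from_phone)
    reply = formulate_response(heard)
    return text_to_speech(reply)

print(handle_turn(b""))  # -> the bot's spoken reply, ready for the phone
```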

The first half of the solution falls under the purview of natural language processing, at least part of which involves converting speech to a form which software can easily understand.

Converting Speech To Text

While there are plenty of open software options for converting text to speech, there aren’t as many for going the other way, from speech to text. They’re also typically in the form of libraries, which is fine for our use. Examples of open ones are CMU Sphinx, Julius, and Kaldi.

More recently, Mozilla has been working on one called DeepSpeech which uses TensorFlow and deep learning. We’ve seen it used once so far, when [Michael Sheldon] adapted it to convert speech to text which he then injects into X applications.
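To give a flavor of the library route, here’s a minimal sketch using the SpeechRecognition Python package as a front end to CMU Sphinx (pip install SpeechRecognition pocketsphinx). The wav file name is a placeholder for a recording of the other side of the call.

```python
# Offline speech-to-text via CMU Sphinx, wrapped by the SpeechRecognition
# package. "order.wav" is a placeholder recording of the pizza place.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("order.wav") as source:
    audio = recognizer.record(source)  # read the entire clip

try:
    # recognize_sphinx() runs entirely on the local machine
    print(recognizer.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx couldn't make out the speech")
```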

Understanding The Text

Once you’ve converted the speech to text, what do you do with it?

In our diagram, the human at the pizza place asked us “Will that be all?”. This could have been worded any number of other ways, for example: “Is that it?”, “That’s all?”.

Pizza ordering chatbot decision tree

One way to handle all these possibilities is to write the formulate-a-response code by throwing together a bunch of if-then-else statements, or perhaps write up a parser backed by some tables. If the conversation is expected to be structured then you can create a decision tree and have the code use that as a guide.
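A toy version of that approach might look like the sketch below. Real matching would have to be far more forgiving, but structurally it’s just this.

```python
# The if-then-else approach in miniature: match what was heard against known
# phrasings and pick a canned reply.

REPLIES = {
    ("will that be all", "is that it", "that's all"): "Yes, that's everything.",
    ("what size",): "Large, please.",
    ("pickup or delivery",): "Pickup, please.",
}

def respond(heard: str) -> str:
    heard = heard.lower().strip(" ?.!")
    for phrasings, reply in REPLIES.items():
        if any(p in heard for p in phrasings):
            return reply
    return "Sorry, could you repeat that?"

print(respond("Will that be all?"))  # -> Yes, that's everything.
```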

AIML (Artificial Intelligence Markup Language) makes that approach easier. AIML was created between 1995 and 2002 by Richard Wallace and has been the basis for a number of chatbots since, including an award-winning one called A.L.I.C.E. Since 2013, the A.L.I.C.E. foundation has been working on a specification for AIML 2.0.

With AIML, you fill an XML file with all the possible things the pizza place could say. The number of them can be minimized using patterns such as “Hi *”, but the pattern language in AIML is limited. It also allows you to provide responses and to limit the conversation to specific topics as they arise. And among its many other features, it has the ability to learn by writing novel things to a file.

For starting out with AIML, see the docs at pandorabots.com. There is also a relatively old interpreter called ProgramAB.
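For a taste of what the XML looks like, here are two categories run through the python-aiml interpreter (pip install python-aiml). Patterns are written in uppercase and matched against a normalized version of the caller’s words, with “*” as the wildcard mentioned above; treat this as a sketch, not a full AIML tutorial.

```python
# Two AIML categories covering different phrasings of the same question,
# loaded into the python-aiml interpreter.
import aiml

CATEGORIES = """<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0.1">
  <category>
    <pattern>WILL THAT BE ALL</pattern>
    <template>Yes, that is everything, thanks.</template>
  </category>
  <category>
    <pattern>IS THAT IT</pattern>
    <template>Yes, that is everything, thanks.</template>
  </category>
</aiml>
"""

with open("pizza.aiml", "w") as f:
    f.write(CATEGORIES)

kernel = aiml.Kernel()
kernel.learn("pizza.aiml")
print(kernel.respond("Will that be all?"))  # -> Yes, that is everything, thanks.
```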

This video shows AIML in use by an open source InMoov robot.

Determining Intent

Intent example

Much of the process-speech portion of our solution basically involves figuring out the intent of whatever the pizza place is saying, though if our code is a mass of if-then-else statements or decision tree structures, it might not seem that way. Ultimately, when the pizza place asks in one of its myriad ways if that’s all we’d like to order, we’d like to boil all the possibilities down to a single, simple intent, “asking_is_that_all”.

Or the intent may come with additional data for us to use. They may say “It’ll be ready in 20 minutes.” or “You can pick it up in 20 minutes.”. In that case we can label the intent “give_order_ready_time” and store the duration, 20 minutes, as additional data.
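Hand-rolled, extracting that intent and its data could be as simple as a regular expression. Here’s a hypothetical sketch for the ready-time case:

```python
# Reduce either wording to one intent plus its attached data.
import re

def extract_intent(heard: str) -> dict:
    match = re.search(r"(\d+)\s*minutes", heard.lower())
    if match:
        return {"intent": "give_order_ready_time",
                "minutes": int(match.group(1))}
    return {"intent": "unknown"}

print(extract_intent("It'll be ready in 20 minutes."))
# -> {'intent': 'give_order_ready_time', 'minutes': 20}
print(extract_intent("You can pick it up in 20 minutes."))
# -> {'intent': 'give_order_ready_time', 'minutes': 20}
```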

Online Services

Free online services exist which do both the speech recognition and the intent determination, capturing any data along the way. Wit.ai, owned by Facebook, is one such service. Another is DialogFlow, formerly Api.ai and now owned by Google. DialogFlow does charge for some things, but nothing a hacker is likely to need. IBM’s Watson Assistant is also free, though with a mix of usage limits.

Wit.ai, DialogFlow, and Watson Assistant chatbots

While Wit.ai does speech recognition and intent determination, DialogFlow and Watson implement the full decision tree, allowing you to use their UIs to script the whole conversation.

Ordering Pizza Using Wit.ai

I decided to try out Wit.ai and here’s the resulting conversation, placing an order for a pizza with a fictitious Johnny’s Pizza. Disclosure: No phone call was actually made, but more on that below.


In brief, here’s how I did it. First, I wrote up a script with all the possible combinations of things Johnny’s Pizza could say, as well as what my bot should respond with. Then I went to Wit.ai and created an App. That involved giving it all the things in my script which Johnny’s Pizza says and, for each one, assigning an intent and indicating any data which should be reported to my code.

Wit.ai expressions and entities

In Wit.ai you actually create entities, of which intents are just one type, but I found my code was easier to write if I made everything an intent. Shown here is a snapshot of some of the expressions, i.e. the things the pizza place might say. I’ve expanded the “Will that be all?” one to show the intent entity with a value of “asking_is_that_all”, which is what I’ll look for in my code. The expression above it and the one below it share that same entity, so for any of them my code only has to look for “asking_is_that_all”.

After that, it was just a matter of writing some Python code on my Raspberry Pi based on their docs and example code on their GitHub. I have an amplifier (a noisy DIY one) and speaker attached to the Pi. I recorded a separate sound clip for each part of the conversation and saved them in individual .wav files. Since my voice was used for both sides of the conversation, I deepened the voice of the bot’s side.

In the code, I iterate through the sound clips for the pizza place as if I’d just received them from a phone, sending them one at a time to Wit.ai. Wit.ai does the speech recognition and analysis and returns the intent and data. I also play the clip to the speaker. Then I use the intent to figure out which of the bot’s clips to play in response. What you hear above is the resulting conversation just as I heard it from the Pi’s speaker.
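Boiled down, the loop amounts to something like the sketch below. The token and file names are placeholders rather than my actual project code, and Wit.ai’s response format has changed over the years, so check their current docs before copying anything.

```python
# Post a wav clip to Wit.ai's /speech endpoint, pull out the intent, and
# choose which of the bot's pre-recorded clips to play in response.
import requests

WIT_TOKEN = "YOUR_SERVER_ACCESS_TOKEN"  # from the Wit.ai app's settings page

def get_intent(wav_path: str) -> str:
    with open(wav_path, "rb") as f:
        resp = requests.post(
            "https://api.wit.ai/speech",
            headers={"Authorization": "Bearer " + WIT_TOKEN,
                     "Content-Type": "audio/wav"},
            data=f)
    resp.raise_for_status()
    # Since I made everything an intent, that's the only entity to look for
    return resp.json()["entities"]["intent"][0]["value"]

BOT_CLIPS = {"asking_is_that_all": "bot_yes_thats_all.wav"}

intent = get_intent("pizza_place_will_that_be_all.wav")
print("Play:", BOT_CLIPS.get(intent, "bot_pardon.wav"))
```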

The code can be found on our GitHub.

The Ultimate: How Google Duplex AI Did It

Listen again to the conversations Google’s Duplex AI had and you’ll be astounded at the language produced by the AI. Impressive as that is, more amazing is that there’s no if-then-else or decision tree involved. Instead, all that logic was trained into a neural network using copious amounts of sample phone conversations on hardware we can only dream of (or pay to use through online services). So for now we’ll have to do that part the old school way.

Adding Natural Language To AIML

One thing we can do, which would be a great open source project, would be to combine something like DeepSpeech with AIML, producing something more similar to DialogFlow or IBM Watson. Perhaps then ordering a pizza over the phone will become only a matter of pressing a button, or we could hook it up to Alexa and have her initiate it. Of course, we might want to announce that we’re a bot at the start of the call and be alerted to intervene if the conversation goes awry. Or record the conversations for posterity, so that the AIs have something to laugh about in ten years.
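As a rough starting point for that glue, here’s a sketch that feeds DeepSpeech’s output straight into an AIML kernel. It assumes roughly the DeepSpeech 0.x Python API and the python-aiml interpreter from earlier; the model and file names are placeholders.

```python
# Speech in, AIML-scripted reply out: DeepSpeech for the ears, AIML for the
# brain. Assumes a 16 kHz, 16-bit mono wav and a pre-trained model file.
import wave
import numpy as np
import aiml
from deepspeech import Model

ds = Model("deepspeech.pbmm")            # pre-trained acoustic model

kernel = aiml.Kernel()
kernel.learn("pizza.aiml")               # the categories from earlier

with wave.open("order.wav", "rb") as w:
    audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

heard = ds.stt(audio)                    # DeepSpeech: audio -> text
print("Heard:", heard)
print("Reply:", kernel.respond(heard))   # AIML: text -> response
```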

from Blog – Hackaday https://ift.tt/2MMm1Qn
via IFTTT

Wiping Robots and Floors: STM32duino Cleans up

Ever find yourself with nineteen nameless robot vacuums lying around? No? Well, [Aaron Christophel] likes to live a different life, filled with zebra print robots (translated). After tearing a couple down, only ten vacuums remain — casualties are to be expected. Through their sacrifice, he found an STM32F101VBT6 processor acting as the brains for the survivors. Coincidentally, there’s a project called STM32duino designed to get those processors working with the Arduino IDE we either love or hate. [Aaron Christophel] quickly added a variant board through the project and buckled down.

Of course, he simply had to get BLINK up and running, using the back-light of the LCD screen on top of the robots. From there, the STM32 processors gave him a whole 80 GPIO pins to play with. With a considerable amount of tinkering, he had every sensor, motor, and light under his control. Considering how each of them came with a remote control, several infra-red sensors, and wheels, [Aaron Christophel] now has a small robotic fleet at his beck and call. His workshop must be immaculate by now. Maybe he’ll add a way for the vacuums to communicate with each other next. One robot gets the job done, but a whole team gets the job done in style, especially with a zebra print cleaner at the forefront.

If you want to see more of his work, he has quite a few videos on his website demonstrating the before and after of the project — just make sure to bring a translator. He even has a handy pinout for those looking to replicate his work. If you want to dive right in to STM32 programming, we have a nice article on how to get it up and debugged. Otherwise, enjoy [Aaron Christophel]’s demonstration of the eight infra-red range sensors and the custom firmware running them.

from Blog – Hackaday https://ift.tt/2yu1IEb
via IFTTT

The Best New Amiga Title of 2018?

Just because a system becomes obsolete for most of us doesn’t mean that everyone stops working with them. Take a look at this brand new game for the Amiga 500 called Worthy, which is sure to make most of us regret ever upgrading our home computers, despite the improvements made since 1987.

The group who developed the game is known as Pixelglass and they have done a lot of work on this platform, releasing several games over the past few years. Their latest is Worthy, an action-adventure game that looks similar to the top-down perspective Zelda games from the SNES. It’s an impressive piece of work for a system that few of us own anymore, but if you have one (or even if you have a good emulator) you might want to give it a whirl.

If developing games for retro systems is your style, this isn’t limited to personal computers like the Amiga. We’ve seen development platforms for the Super Nintendo that will let you run your own code, and even other methods for working with the Sega Saturn if you’re feeling really adventurous.

Thanks to [Chappy1978] for the tip!

from Blog – Hackaday https://ift.tt/2M6io6P
via IFTTT

Desktop Radio Telescope Images The WiFi Universe

It’s been a project filled with fits and starts, and it very nearly ended up as a “Fail of the Week” feature, but we’re happy to report that the [Thought Emporium]’s desktop WiFi radio telescope finally works. And it’s pretty darn cool.

If you’ve been following along with the build like we have, you’ll know that this stems from a previous, much larger radio telescope that [Justin] used to visualize the constellation of geosynchronous digital TV satellites. This time, he set his sights closer to home and built a system to visualize the 2.4-GHz WiFi band. A simple helical antenna rides on the stepper-driven azimuth-elevation scanner. A HackRF SDR and GNU Radio form the receiver, which just captures the received signal strength indicator (RSSI) value for each point as the antenna scans. The data is then massaged into colors representing the intensity of WiFi signals received and laid over an optical image of the scanned area. The first image clearly showed a couple of hotspots, including a previously unknown router. An outdoor scan revealed routers galore, although that took a little more wizardry to pull off.
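The mapping step is easy to picture in a few lines of Python. This sketch fakes the scan data with a couple of hotspots; in the real rig, each value would come from the HackRF receiver with the antenna pointed at that azimuth/elevation.

```python
# Build a heatmap of RSSI over the scanned az/el grid, ready to overlay on a
# photo of the scene. The readings here are faked for illustration.
import numpy as np
import matplotlib.pyplot as plt

azimuths = np.arange(0, 90, 5)    # degrees, one column per scanner step
elevations = np.arange(0, 45, 5)  # degrees, one row per scanner step

rssi = np.random.uniform(-90, -80, (len(elevations), len(azimuths)))
rssi[4, 6] = rssi[2, 10] = -40    # two pretend routers

plt.imshow(rssi, origin="lower", cmap="inferno",
           extent=[azimuths[0], azimuths[-1], elevations[0], elevations[-1]])
plt.colorbar(label="RSSI (dB)")
plt.xlabel("Azimuth (deg)")
plt.ylabel("Elevation (deg)")
plt.title("2.4 GHz WiFi intensity map")
plt.show()
```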

The videos below recount the whole tale in detail; skip to part three for the payoff if you must, but at the cost of missing some valuable lessons and a few cool tips, like using flattened pieces of Schedule 40 pipe as a construction material. We hope to see more from the project soon, and wonder if this FPV racing drone tracker might offer some helpful hints for expansion.

from Blog – Hackaday https://ift.tt/2yuLue1
via IFTTT

Learn Something About Phase Locked Loops

The phase locked loop, or PLL, is a real workhorse of circuit design. It is a classic feedback loop where the phase of an oscillator is locked to the phase of a reference signal using an error signal in the same basic way that perhaps a controller would hold a temperature or flow rate in a physical system. That is, a big error will induce a big change and little errors induce little changes until the output is just right. [The Offset Volt] has a few videos on PLLs that will help you understand their basic operation, how they can multiply frequencies (paradoxically, by dividing), and even demodulate FM radio signals. You can see the videos below.

The clever part of a PLL can be found in how it looks at the phase of two signals. For signals to be totally in phase, they must be at the same frequency and also must ebb and peak at the same point. It should be clear that if the frequency isn’t the same the ebbs and peaks can’t line up for any length of time. By detecting how much the signals don’t line up, an error voltage can be generated. That error voltage is used to adjust the output oscillator so that it matches the reference oscillator.

Of course, it wouldn’t be very interesting if the output frequency had to be the same as the reference frequency. The clever trick comes by dividing the output frequency. For example, a 100 MHz crystal oscillator is difficult to design. But taking a voltage-controlled oscillator at 100 MHz (nominal) and dividing its output by 100 will give you a signal you can lock to a 1 MHz crystal oscillator which is, of course, trivial to build.

The real detail lies in the phase comparison and the loop filtering, something that will make more sense once you have watched the videos.
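If you’d like to poke at the idea numerically before watching, here’s a toy simulation of that whole story: a VCO near 100 kHz is divided by 100, phase-compared against a 1 kHz reference, and steered by the filtered error until it locks. All the gains and frequencies are arbitrary illustration values, not a recipe for a real loop.

```python
# First-order PLL toy: phase detector, one-pole loop filter, divide-by-N.
import math

f_ref = 1_000.0      # reference oscillator (Hz)
f_vco = 99_000.0     # VCO starts 1 kHz off its locked frequency (Hz)
N = 100              # feedback divider, so lock means f_vco = N * f_ref
dt = 1e-6            # simulation time step (s)
kp = 20_000.0        # VCO steering gain (Hz per unit of filtered error)

ref_phase = vco_phase = error_filt = 0.0

for step in range(200_000):                  # simulate 0.2 seconds
    ref_phase += 2 * math.pi * f_ref * dt
    vco_phase += 2 * math.pi * f_vco * dt
    # Phase detector: compare the reference with the divided-down VCO
    error = math.sin(ref_phase - vco_phase / N)
    # Loop filter: smooth the raw error into a steering "voltage"
    error_filt += 0.001 * (error - error_filt)
    f_vco = 99_000.0 + kp * error_filt
    if step % 50_000 == 0:
        print(f"t = {step * dt * 1e3:5.0f} ms   VCO = {f_vco:9.1f} Hz")
# The printed VCO frequency converges on 100,000 Hz: N times the reference.
```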

Based on video views, [The Offset Volt] may be the best YouTube channel people aren’t watching much. The videos are clear and easy to understand! He’s even worked in a reference to a self-sealing stem bolt, so we can’t help but be impressed with that.

Our resident video star [Bil Herd] did a talk on PLLs a good while back. If you’d rather read and want to see how a PLL works in software, we’ve talked about that, too.

from Blog – Hackaday https://ift.tt/2teJiCq
via IFTTT

Trackball Gets Bolt-On Button Upgrade

The question of whether to use a mouse versus a trackball is something of a Holy War on the level of Vi versus Emacs. We at Hackaday want no part of such things: use whatever you want, and leave us out of it. But we will go as far as to say that Team Trackball seems to take things mighty seriously. We’ve never met a casual trackball user: if they’ve got a trackball on their desk then get ready to hear all about it.

With that in mind, the lengths [LayeredDesigns] went to just to add a couple extra buttons to his CST trackball make a bit more sense. Obviously enamored with this particular piece of pointing technology, he designed a 3D printed “sidecar” that you can mount to the left side of the stock trackball. Matching the shape of the original case pretty closely, this add-on module currently hosts a pair of MX mechanical keys, but the plans don’t stop there.

[LayeredDesigns] mentions that all the free room inside the shell for this two-button modification has got him thinking of what else he could fit in there. The logical choice is a Teensy emulating a USB HID device, which could allow for all sorts of cool programmable input possibilities. One potential feature he mentioned was adding a scroll wheel, which the Teensy could easily interface with and present to the operating system.
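The Teensy crowd would normally write that as an Arduino sketch, but to keep with this page’s Python theme, the same idea looks like this under CircuitPython on any board with native USB. The encoder pins are placeholders for whatever you actually wire up.

```python
# USB HID scroll wheel: a rotary encoder's motion is reported to the host as
# mouse wheel clicks (CircuitPython with the adafruit_hid library).
import board
import rotaryio
import usb_hid
from adafruit_hid.mouse import Mouse

mouse = Mouse(usb_hid.devices)
encoder = rotaryio.IncrementalEncoder(board.D2, board.D3)  # placeholder pins

last = encoder.position
while True:
    pos = encoder.position
    if pos != last:
        mouse.move(wheel=pos - last)  # scroll by however far the knob turned
        last = pos
```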

We’ve seen our fair share of 3D printed keyboards and keyboard modifications, but we can’t say the same about the legendary trackball. Ones made of cardboard, sure. Pulled out of a military installation and hacked to add USB? You bet. This project is just more evidence of what’s possible with a 3D printer, a caliper, and some patience.

[via /r/functionalprint]

from Blog – Hackaday https://ift.tt/2te4ett
via IFTTT