Sunday, June 9, 2013

Speech issues; changing course again

Sorry it's been a while since my last post. I've been swimming in the deep end of Linux speech recognition, and unfortunately I haven't had much luck. I was able to get CMU Sphinx compiled and "working," but the recognition accuracy was extremely low, so I had to find an alternative. I attempted to follow the VoxForge guide for setting up Julius and HTK, but I ran into a lot of trouble there as well. HTK would not compile correctly with the Pi's gcc 4.6, and after several hours of searching I was not able to find a gcc version older than 4 that would work with the Pi.

Due to time constraints, I've decided to abandon speech recognition for pi-amp. Instead, I'm going to build a custom media player in Python controlled by touch. For the touch aspect, I've ordered this kit from eBay that comes with a 7" monitor, a 7" touchscreen panel to fit the monitor, and a controller board that converts the touch input to USB.


Here is a rough rendition of the interface:

The top-left button will send a "sudo halt" to the Pi (shutdown command), so that I can safely turn the Pi off in the vehicle.

The middle-left button may or may not be in the final rendition, but I decided to create it at least for the testing phases of the application. This button will reboot the Pi.
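
A minimal sketch of how the two power buttons might be wired up in Python. The "sudo halt" command comes from the description above; "sudo reboot" for the reboot button is my assumption, and the names POWER_COMMANDS, power_command, and run_power_action are made up for illustration. The runner argument is there only so the logic can be tried on a desktop without actually shutting anything down.

```python
import subprocess

# Hypothetical mapping from a button's action name to the command it runs.
# "sudo halt" is the shutdown command from the post; "sudo reboot" is an
# assumed equivalent for the reboot button.
POWER_COMMANDS = {
    "shutdown": ["sudo", "halt"],
    "reboot": ["sudo", "reboot"],
}


def power_command(action):
    """Return a copy of the argument list for a power action."""
    if action not in POWER_COMMANDS:
        raise ValueError("unknown power action: %r" % (action,))
    return list(POWER_COMMANDS[action])


def run_power_action(action, runner=subprocess.call):
    """Run the command for `action` and return the runner's result.

    `runner` defaults to subprocess.call, but can be swapped for a fake
    when testing away from the Pi.
    """
    return runner(power_command(action))
```

On the Pi, the shutdown button's handler would then just call run_power_action("shutdown").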

The lower-left button will open a mixer/equalizer menu to adjust bass & treble.

The music button will allow the user to browse the contents of the music library, as well as select a song to play. The primary display area will be replaced with a file browser. Once a file is selected, the user will be returned to the main interface.

The video button will allow the user to browse the contents of the video library, as well as select a video to play. The primary display area will be replaced with a file browser. Once a file is selected, the primary display area will be replaced with the chosen video.
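
Both browsers boil down to listing one directory and keeping only the folders and the playable files. Here is a rough sketch of that step, not the final code: the extension lists and the name list_entries are assumptions, since I haven't settled on which formats the player will support.

```python
import os

# Assumed sets of playable extensions; the real lists depend on the player backend.
MUSIC_EXTENSIONS = (".mp3", ".ogg", ".flac", ".wav")
VIDEO_EXTENSIONS = (".mp4", ".avi", ".mkv")


def list_entries(directory, extensions):
    """Return (subdirectories, media files) found directly in `directory`.

    Names are sorted case-insensitively, and files are matched against
    `extensions` without regard to case.
    """
    dirs, files = [], []
    for name in sorted(os.listdir(directory), key=str.lower):
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            dirs.append(name)
        elif name.lower().endswith(extensions):
            files.append(name)
    return dirs, files
```

The music button would call this with MUSIC_EXTENSIONS and the video button with VIDEO_EXTENSIONS; tapping a folder name would call it again on that subdirectory.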

The central info-box will display information about the currently playing file.

As for the player controls, they are fairly self-explanatory. From left to right:

  • mute/unmute toggle button (will switch graphics when toggled)
  • previous
  • rewind
  • play/pause toggle button (will switch graphics when toggled)
  • stop
  • fast forward
  • next
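
The two toggle buttons need to remember their state so they know which graphic to show. A small sketch of that bookkeeping, assuming hypothetical image file names and a made-up PlayerState class; the actual playback calls are left out because I haven't picked a backend yet.

```python
# Hypothetical graphic file names, used only for illustration.
PLAY_ICON = "play.png"
PAUSE_ICON = "pause.png"
MUTE_ICON = "mute.png"
UNMUTE_ICON = "unmute.png"


class PlayerState(object):
    """Tracks the play/pause and mute toggles, independent of any GUI toolkit."""

    def __init__(self):
        self.playing = False
        self.muted = False

    def toggle_play(self):
        """Flip play/pause and return the graphic the button should now show."""
        self.playing = not self.playing
        # While playing, the button offers "pause"; while paused, it offers "play".
        return PAUSE_ICON if self.playing else PLAY_ICON

    def toggle_mute(self):
        """Flip mute/unmute and return the graphic the button should now show."""
        self.muted = not self.muted
        # While muted, the button offers "unmute"; otherwise it offers "mute".
        return UNMUTE_ICON if self.muted else MUTE_ICON

    def stop(self):
        """Stop playback and return the graphic for the play/pause button."""
        self.playing = False
        return PLAY_ICON
```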

The last button on the right is a placeholder graphic - I'm going to come up with something unique to the project. This button will bring up the main menu in the primary display area. If a video is playing, the user can press this main menu button again to hide the menu and return to the video.


I still need to find a good place for a volume control, and possibly a horizontal seek slider so the user can jump to different parts of the file more easily.
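
If the seek slider does make it in, the math behind it is simple: the touch position along the slider becomes a fraction of the file's length. A quick sketch, assuming the slider reports a pixel offset from its left edge; the name slider_to_seconds and its parameters are made up for illustration.

```python
def slider_to_seconds(x, slider_width, duration):
    """Convert a touch offset on the slider into a playback position.

    x            -- horizontal offset of the touch from the slider's left edge (pixels)
    slider_width -- total width of the slider (pixels)
    duration     -- length of the current file (seconds)

    Touches that land slightly outside the slider are clamped to its ends.
    """
    if slider_width <= 0:
        raise ValueError("slider_width must be positive")
    fraction = float(x) / slider_width
    fraction = min(max(fraction, 0.0), 1.0)
    return fraction * duration
```

For example, a touch 200 pixels into an 800-pixel slider on a four-minute (240-second) track would seek to the one-minute mark.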


This next week should have several posts - I have a lot to get done. First, I'm going to get a more final draft of the interface done, and then I'll be starting work on the Python code. Stay tuned!

