First Semester Done: Artificial Intelligence and Database Management Systems

I survived my first semester in the Computer Science PhD program at Nova Southeastern University in Ft. Lauderdale, FL.  The two classes that I took this semester were CISD 750: Database Management Systems and CISD 760: Artificial Intelligence.  Both were a challenging mix of research, exams, and papers, and both required considerable work.  I ended up with an A in AI and an A- in DBMS.  I gained valuable knowledge and insights for my eventual dissertation.  Both instructors were helpful and very knowledgeable about their topics.

Journal articles were a major part of both classes.  There were over 20 assigned papers for the database class, and I reviewed nearly that many for the AI class.  Reading journal articles was something that I expected from a doctoral program.  Although this is a distance degree program, both classes also had challenging in-class midterm examinations.  The program requires me to spend four days on campus per semester, split over two trips.  I traveled to Ft. Lauderdale in both August and October.

For this semester, the AI course was taught by Sumitra Mukherjee and the database course by Junping Sun.  Both were great!  I would very much recommend classes taught by either.

CISD 750: Database Management Systems

The database management class covered many aspects of DBMS design.  I had not previously taken an academic class on database systems, so there was quite a bit of new material for me.  I had not worked with relational algebra prior to this class.  I also learned about functional dependencies, schema normalization, the chase algorithm, hashing, indexing, dynamic programming, query optimization, and other topics.  Most of the focus of the class was on relational databases; however, some time was spent on NoSQL topics.  The final and midterm both tested our ability to work through these algorithms.
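To give a flavor of the material, here is a minimal Python sketch of the attribute-closure algorithm used to reason about functional dependencies.  This is my own illustration, not course code:

def attribute_closure(attrs, fds):
    """Closure of a set of attributes under functional dependencies,
    where fds is a list of (lhs, rhs) frozenset pairs."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the closure covers the left side, it determines the right side.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

# Example: R(A, B, C) with A -> B and B -> C gives {A}+ = {A, B, C},
# so A is a key of R.
fds = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(attribute_closure({"A"}, fds))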

I chose to do my research paper on frequent itemset mining, a Knowledge Discovery in Databases (KDD) topic.  I compared the Apriori, Eclat, and FP-Growth algorithms in an empirical study of what datasets are conducive to each algorithm.  Most papers that I read about these algorithms studied the performance effects of varying the support threshold on known datasets.  I wanted to take a different approach, so I looked at how I could generate a dataset with a specified frequent itemset density and number of frequent items.  I created a Python application to generate this simulated data; a simplified sketch of the idea appears below.  I was happy with how the paper turned out, and I plan on posting the results from this research in a separate post here in the future.
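The actual generator is more involved, but the following minimal Python sketch shows the basic idea: plant chosen itemsets into a controllable fraction of transactions and pad the rest with random noise items.  The names and parameters here are my own illustration, not the paper's code:

import random

def generate_transactions(n_transactions, n_items, planted_itemsets, density,
                          basket_size=8):
    """Generate synthetic market-basket data.  A fraction 'density' of the
    baskets is seeded with one of the planted (frequent) itemsets."""
    transactions = []
    for _ in range(n_transactions):
        basket = set()
        if random.random() < density:
            # Seed the basket with one of the planted frequent itemsets.
            basket.update(random.choice(planted_itemsets))
        # Pad with random noise items.
        while len(basket) < basket_size:
            basket.add(random.randrange(n_items))
        transactions.append(sorted(basket))
    return transactions

# 10,000 baskets over 100 items; two planted itemsets in roughly 30% of baskets.
data = generate_transactions(10000, 100, [(1, 2, 3), (4, 5)], 0.3)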

I really liked the textbook that was chosen for this class.  Database Systems: The Complete Book (2nd Edition) appears to be a classic in the field of DBMS.  The examples and explanations were very clear.  The material was quite different from what I am used to for databases.  I've worked with databases, such as Oracle and MySQL, for years, and it was very interesting to see them at a more academic level.  I feel that I gained a deeper understanding of database topics.

CISD 760: Artificial Intelligence

The AI class covered a wide range of topics that provided a very good foundation in AI.  The optional textbook was Artificial Intelligence: A Modern Approach.  I feel the textbook was a great choice, and I already owned a copy before the course began.  Topics covered in this course included A*, neural networks, decision trees, Bayesian inference, and genetic algorithms.  We were given an assignment where we had to make use of several of these algorithms; a small illustration of A* follows.  I used Encog for the genetic algorithm and A* portions of the programming assignment.
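The assignment itself used Encog (which is Java), but the flavor of the search is easy to show.  Here is a minimal A* pathfinding sketch in Python over a small grid; everything in it is my own example, not assignment code:

import heapq

def astar(start, goal, grid):
    """A* over a 2D grid of 0 (open) and 1 (blocked) cells, using
    Manhattan distance as the admissible heuristic."""
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start)]
    g = {start: 0}
    parent = {}
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if node == goal:
            # Walk the parent links back to the start to recover the path.
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (node[0] + dr, node[1] + dc)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0
                    and cost + 1 < g.get(nxt, float("inf"))):
                g[nxt] = cost + 1
                parent[nxt] = node
                heapq.heappush(frontier, (cost + 1 + h(nxt), cost + 1, nxt))
    return None  # no path exists

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar((0, 0), (2, 0), grid))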

Another assignment asked us to choose a peer-reviewed article and write a critique.  For my article, I chose:

Larochelle, H., Bengio, Y., Louradour, J., & Lamblin, P. (2009). Exploring strategies for training deep neural networks. Journal of Machine Learning Research, 10, 1-40.

I am interested in deep learning and wanted to research and understand some of the issues with applying it to continuous inputs.  I ended up implementing a deep belief neural network in Java; the source code is on GitHub.  I will make use of this code for Volume 3 of my AIFH series.
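My implementation is Java, but the core pretraining step of a deep belief network fits in a few lines of numpy.  Below is a minimal sketch of the contrastive-divergence (CD-1) update used to train each restricted Boltzmann machine layer; it illustrates the algorithm and is not the code from the repository:

import numpy as np

def cd1_update(W, b_vis, b_hid, v0, lr=0.1):
    """One contrastive-divergence (CD-1) step for a binary RBM layer.
    v0 is a batch of visible vectors with shape (n, n_vis)."""
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Positive phase: hidden activations driven by the data.
    h0_prob = sigmoid(np.dot(v0, W) + b_hid)
    h0 = (np.random.rand(*h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: one step of Gibbs sampling back down and up.
    v1_prob = sigmoid(np.dot(h0, W.T) + b_vis)
    h1_prob = sigmoid(np.dot(v1_prob, W) + b_hid)
    # Move the weights toward the data statistics, away from the model's.
    n = v0.shape[0]
    W += lr * (np.dot(v0.T, h0_prob) - np.dot(v1_prob.T, h1_prob)) / n
    b_vis += lr * (v0 - v1_prob).mean(axis=0)
    b_hid += lr * (h0_prob - h1_prob).mean(axis=0)
    return W, b_vis, b_hid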

The major paper for this class was an idea paper.  Idea papers are used in this program to capture potential ideas for a dissertation.  Mine detailed research that I might like to perform on using continuous input with deep learning.  I believe that I would like to do my dissertation in the area of AI, though I am not sure I will choose deep learning.  Nevertheless, the paper was a great exercise.  There are several areas where I gently nudged the boundary of human understanding while writing parts of Encog, and I plan to explore several of these for a potential dissertation.  I also still have one more semester before I need to get really serious about a dissertation topic.

Next Semester

Next semester I am taking two courses again, and then two more in the fall of 2015.  Once I am through these courses, I have two more semesters split between a single course and research hours.  After that I will be a PhD candidate and on to dissertation work.

The two courses that I will take next semester (Winter 2015) are:

  • CISD 792: Computer Graphics
  • ISEC 730 / DCIS 730: Network Security and Cryptography

The textbooks for my computer graphics class are shown here.  There is no assigned book for the security class.  My guess is there will be a number of assigned papers for the security class.

books_winter_2015


I am looking forward to next semester!  It looks like I will be learning about three.js in the graphics class, which I am particularly excited about.

Handling Multiple Java Versions on a Mac

I would like to make use of Java 8 in Encog now, as well as in the upcoming Volume 3 of AIFH.  I primarily use a Mac at home and a Microsoft Surface while traveling.  In this post, I will describe how I set up the Mac to use Java 8.

The first step is to install Java 8, or whatever version of Java you wish to use.  Installers can be found at the following page.

http://www.oracle.com/technetwork/java/javase/downloads/index.html

This downloads a Mac package to install.  Once you install the package, the goal is to be able to go to a command line, issue the command java -version, and see the correct version.  For example:

jeffs-mbp:~ jheaton$ java -version
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
jeffs-mbp:~ jheaton$ 

To point the command line at a particular installation, you must first find its exact version number.  The java_home command can list every installed version:

jeffs-mbp:~ jheaton$ /usr/libexec/java_home  -V
Matching Java Virtual Machines (5):
    1.8.0_25, x86_64:	"Java SE 8"	/Library/Java/JavaVirtualMachines/jdk1.8.0_25.jdk/Contents/Home
    1.7.0_25, x86_64:	"Java SE 7"	/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home
    1.7.0_13, x86_64:	"Java SE 7"	/Library/Java/JavaVirtualMachines/jdk1.7.0_13.jdk/Contents/Home
    1.6.0_65-b14-462, x86_64:	"Java SE 6"	/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
    1.6.0_65-b14-462, i386:	"Java SE 6"	/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home

/Library/Java/JavaVirtualMachines/jdk1.8.0_25.jdk/Contents/Home
jeffs-mbp:~ jheaton$

Now that you have the version number, you must add the following line to your ~/.bash_profile file.

export JAVA_HOME=`/usr/libexec/java_home -v 1.8.0_25`

Make sure to remove any other JAVA_HOME directives from the file, then open a new terminal session and re-run java -version to confirm the change.


The 18 Kickstarter Projects that I've Backed

As of today I've backed 20 Kickstarter projects, have run two successful projects of my own, and am planning my third.  Of these 20 backed projects, 18 were funded.  In this post I will give a summary of my experiences, as a backer, with Kickstarter.  Here are a few of the items I've received.

jheaton_backer

I have not yet had a backed Kickstarter project go completely AWOL.  There are certainly cases of that happening; it just has not yet occurred in a project that I've backed.  Most Kickstarter projects do run late.  The initial project estimate is just that, an estimate, and few projects hit it.  These projects are often in completely uncharted waters of design, production, and international order fulfillment.  Once a project is late, communication is the key: at that point the creator needs backers to keep faith in their ability to fulfill the project.  I've seen projects run late by over a year and still pull through.

Another issue that I've observed is that backers often treat Kickstarter as a sort of online mall.  You should not be buying Kickstarter projects as birthday or Christmas gifts.  Backers often do exactly that, seeing that the expected delivery date falls a few months before the date that they need the gift.  Then the project runs behind, Christmas is missed, and emotions run high.  I've actually seen a few Kickstarter projects miss two Christmases in a row and still deliver, well over two years behind.  This is the nature of backing a product that is not yet complete, much less in production.  I believe that Kickstarter is about being involved in an immature product that you believe in!

Here is a summary of the Kickstarter projects that I've backed (in the order I backed them):

  1. Code Monkey Save World: Major fan of Jonathan Coulton!  The project ran late, but was everything I expected and more.
  2. Sparki - The Easy Robot for Everyone: Cool robotics platform!  Just a few months late, and everything I expected.
  3. Dog Sled Saga: Cute little platformer game.  Delivered an alpha on time and continues to evolve.
  4. LightUp: Learn by Making: Really neat electronics kit concept, but the project has had a number of issues.  I did get the kit just a few days ago, but there seem to be some discrepancies between the described contents of the kit and what people were actually shipped.  The production quality looks to be good, and I was able to use mine.  Just no instructions, at this point.
  5. The Stage at KDHX: A local Kickstarter project for a radio station in  my area.  Everything went great!
  6. Supertoy - World's First Natural Talking Teddy Bear: An AI-enabled teddy bear.  How could I pass this one up?  The video showed a teddy bear that could converse as well as Data from Star Trek.  Clearly that did not happen, or you would have read about the dawn of strong AI in all major media.  However, after running over a year behind, I did get a bear capable of looking up things in Wikipedia.  It is actually kind of cool; it was just a wild ride of some 80 somewhat strange updates, many of them the project creator sharing YouTube videos that he found interesting.
  7. Dataflow & Reactive Programming Systems: A book project by a local author.  The project delivered what it promised.
  8. Star Trek Continues Webseries: This is a really cool project if you are a fan of the original Star Trek.  Delivered exactly what I expected.
  9. Chaos Drift - A Nostalgic RPG Experience: A classic RPG style game from my hometown.  The project is somewhat behind but seems to be progressing well.
  10. KUNG FURY: A movie in the classic 1980's action style.  The project is a bit behind at this point, but seems to be progressing.
  11. Hello Ruby: A computer programming book for kids.  A huge Kickstarter success; however, it is now a few months behind.  Looks promising.
  12. iOS App Development for Teens by Teens: A book about iOS app development targeted at teens, and written by a teen.  Project delivered on time and to expectations!
  13. JUMP Cable: A small device that can be used to recharge iPhones and other devices.  Currently shipping.
  14. Mini Museum: Totally amazing project that lets you own a miniature museum in an acrylic case.  This was one of my favorite Kickstarter projects.  Small delay, but I got mine yesterday.  Great communication with backers.
  15. A History of the Great Empires of Eve Online: A history of the MMOG Eve.  Seems to be progressing well.
  16. The Universim: Looks like the ultimate successor to Civilization; seems to be progressing well.
  17. Bring Reading Rainbow Back for Every Child, Everywhere!: I watched Reading Rainbow when younger, and am a major fan of the chief engineer of the Enterprise.  How could I not back LeVar?  The project was epic and seems to be on track.  Butterfly in the sky, this thing went twice as high: take a look, it hit $5 million!
  18. The PHD Movie 2: Still in Grad School: Since entering PhD grad school myself, I've seen many references to these comics.  Project seems to be doing well!

There you have it!  These are the projects I backed on Kickstarter.  This is a total of $1041 in backing dollars.  So far it has been a great experience!

Multi Agent Modeling Presentation & Midterms

I co-presented on the topic of Agent Based Modeling (ABM) at the 2014 Society of Actuaries annual meeting, held in Orlando, FL, October 26-29, 2014.  October was a busy month for me; I had to fly to both Ft. Lauderdale and Orlando.  I know!  There are worse places to fly to, but it was still a somewhat hectic schedule.  I flew to Ft. Lauderdale first to attend the midterms for the computer science PhD program that I am enrolled in.

october-travel-2014


One of the booths at the SOA meeting was set up to take green-screen photos and place the attendee on the cover of The Actuary.  Here is my photo:

jheaton_actuary

It was an interesting meeting.  There were keynote presentations by both Madeleine Albright, former US Secretary of State, and Dr. Adam Steltzner, lead landing engineer of NASA's Mars Science Laboratory Curiosity rover project.  Both keynote speakers were fascinating.  There were quite a few data science related sessions during the conference.

I presented on ABM along with Dr. Anand S. Rao.  I began the talk with an overview of what ABM is and introduced the open source utility Repast.  Dr. Rao continued the talk and showed how PricewaterhouseCoopers (PwC) makes use of agent modeling.  You can see the slides for the presentation here (link to SOA site).
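If you have not seen agent-based modeling before, the core idea is easy to show in code: many simple agents follow local rules, and interesting aggregate behavior emerges.  Repast is a full simulation toolkit; the sketch below is just a toy wealth-exchange model in Python, entirely my own illustration:

import random

def wealth_exchange(n_agents=100, steps=10000):
    """Each step, one random agent with money gives a unit to another
    random agent.  Despite the symmetric rule, wealth concentrates."""
    wealth = [1] * n_agents
    for _ in range(steps):
        giver = random.randrange(n_agents)
        if wealth[giver] > 0:
            wealth[giver] -= 1
            wealth[random.randrange(n_agents)] += 1
    return sorted(wealth, reverse=True)

print(wealth_exchange()[:10])  # the few richest agents hold many units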

I also had to get ready for the midterms of my first semester.  I am enrolled in two courses, so I studied for both Artificial Intelligence and Database Management Systems.

study-midterm-2014

I will post more about the semester once it concludes in a few weeks.  I am already signed up for Winter 2015, and will be taking Computer Graphics and Cryptography.

Roasting and Brining a Turkey

thanksgiving_done

There are many different ways to cook a turkey.  I've prepared turkeys for our family Thanksgiving meals for the last several years, and the results have been good enough that I am often asked about the exact process.  In this post I will cover it.  Like many of my posts about artificial intelligence, mathematics, and computer programming, this one exists partly so that I remember the process I used.

I brine and then roast the turkey.  Brining is a process that I first learned about from a wine/food class at Balaban's restaurant.  If you are ever in the St. Louis, MO area, I highly recommend Balaban's!  The first year we tried brining, it was a hit.  Brining produces a very juicy, flavorful turkey!

Preparing the Turkey

I usually prepare two turkeys each Thanksgiving holiday.  My wife and I usually celebrate Thanksgiving with her side of the family and mine on two consecutive days.  Because of this, I have two turkeys to thaw and brine, which requires some planning.  I usually have a turkey schedule on the refrigerator leading up to the big day.

thanksgiving_schedule

I always use a frozen turkey, usually a Butterball.  We did try a fresh turkey one year; however, it did not seem to taste significantly different, at least to us.  Also, because I am cooking turkeys on two consecutive days, it is difficult to acquire a fresh turkey for the day after Thanksgiving.  I prefer buying two frozen turkeys a week or two before Thanksgiving.

Thawing

I usually buy a 20 pound bird, and therefore need to thaw for 5 days; the general guideline is about one day of refrigerator thawing for every four to five pounds.  Butterball has a great calculator for this.  I find that at the end of 4 complete days my 20 pound turkey is nearly completely thawed.  At the beginning of the fifth day, I begin the brining process.  There are certain safety procedures you should always follow when cooking a turkey.  For precise instructions refer to the USDA.

Brining

The brine is based on my own experimentation, drawn from several recipes.  I use the following ingredients:

  • Salt (the amount is determined by the floating-egg test below)
  • Half of a cup of brown sugar
  • Half of a cup of regular sugar
  • One onion (chopped)
  • 4 stalks of celery (chopped)
  • 4 carrots (chopped)
  • 1 egg (only used to measure)

I've always used the floating-egg method to determine how much salt to use for the brine solution.  The amount of brine you will need depends on the method that you use to submerge your turkey.  I always brine in the refrigerator, so I need several gallons of brine to fill the brining bag that I place inside it.  I have a large vegetable drawer in my refrigerator that holds the brining bag perfectly.  I've also considered purchasing a large vat that would fit inside my refrigerator.  Whatever method you use, the turkey must be submerged in the brine solution for 24 hours, and doing this outside of a refrigerator is not recommended.

  • Place several gallons of water in a large stockpot; use as much water as the brine you will need.
  • Add salt to the water until an egg floats.  (There are other ways to determine the amount of salt; this method has worked well for me.)
  • Add half a cup of brown sugar, half a cup of regular sugar, and the chopped vegetables.
  • Boil the entire solution for 20 minutes.
  • Let the solution cool overnight.  It takes a while for this much boiling water to cool, so plan accordingly.
  • Do not submerge the turkey in a boiling or near-boiling brine solution, as this would unsafely begin to cook the turkey.  Chill the brine to room temperature or below.
  • Remove the vegetables from the brine and discard them, being careful if the liquid is still hot.
  • Unwrap turkey and remove any giblets/bags that are in the neck and/or body cavities.
  • Submerge the turkey in the brine for 24 hours.

Cooking

Once the brining of the turkey is complete, cooking can begin.  I use the following ingredients for the actual cooking.

  • Brined turkey
  • Olive Oil (Extra Virgin)
  • Poultry Spice
  • Pepper
  • Apple
  • 3 Carrots
  • Rosemary

I do not like to cook stuffing inside the turkey; this practice is considered unsafe by the USDA.  However, I do like to stuff the turkey with an apple, carrots, and some rosemary to add to the taste and aroma of the cooked turkey.

  • Remove turkey from brine solution and rinse.
  • Add quartered apple, carrots and rosemary to the turkey's body cavity.
  • Brush olive oil on the outside skin of the entire turkey.
  • Sprinkle pepper and poultry spice over the outside of the turkey.
  • Insert a thermometer into the thickest part of the thigh of the turkey (there are YouTube videos that show where this is).
  • Cover turkey in foil for the first 1.5 hours of cooking.
  • Cook the turkey until the thigh reaches 165 degrees Fahrenheit.  For precise instructions refer to the USDA.
  • Basting is not necessary but will result in more even browning of the turkey.
  • Once the thigh reaches 165 degrees, let the turkey rest for 30 minutes before serving.

This is what my turkey looked like just before cooking.

thanksgiving_ready

I use an electronic thermometer.  You can see the cord in the picture above and the actual thermometer in the picture below.

thanksgiving_thrmometer

Serving

Here are some other recipes that I like for the Thanksgiving day meal.


Encog 3.3 for Java and C# Released

I just released Encog 3.3 for Java and C#.  This includes deployment to Maven and NuGet.  The download pages are here:

This is primarily a maintenance release; however, it does introduce EncogModel, which will eventually replace the current Encog Analyst classes.  The idea is to make it much easier to switch model types and perform model selection.  Click here to see EncogModel in action.

This release also added the random number generators that were introduced in AIFH Vol 1.


Creating a Python 3.x Data Science Working Environment

The purpose of this post is to demonstrate how to get an effective "data science" environment up and running with Python 3.  It gives a common set of instructions that I can reference from my books, articles, and other materials.  Even if you are not reading one of my books or articles, you might find this information useful.

I feel that Python 3.x made some great strides toward source code clarity and binary efficiency.  Debating the relative pros and cons of Python 3.x vs. 2.x is not the purpose of this post; the purpose is to document how to install what I consider to be a decent "data scientist" working environment with Python 3.x.  Primarily this is so that I do not forget how to do this, but the rest of the world might benefit as well.

First of all, if you do not need Python 3.x, then just install Anaconda and call it a day.  Anaconda is a scientific distribution of Python 2.7 that will give you all that you need.

https://store.continuum.io/cshop/anaconda/

What do I mean by a "data scientist" environment?  In particular, I make use of the following packages:

  • Numpy - For numerical processing.
  • Scipy - For scientific processing not covered by Numpy.
  • Scikit-Learn - For machine learning.
  • Theano - For numerical processing not covered by Numpy and deep learning.
  • Matplotlib - For charting.
  • Pygame - For visualization.
  • Oracle - For database access.
  • (anything else needed by the above)

The first thing to realize about installing anything in Python is that you are dealing with "pure Python" and "binary" packages.  Pure Python packages can be installed with the "pip" command.  Most serious numerical packages are written in C, C++, or Fortran (yes, I said Fortran; it is a serious data science language, even in 2014), because they would simply be too slow in pure Python.  All of the above packages are "binary", and each must be installed in its own unique way.

Another consideration is 64-bit or 32-bit.  This document assumes 64-bit!

Installing Python 3.x on Windows, Mac and Linux all present their own unique challenges.  I will eventually describe all three, however, for now, this post is a "work in progress."

Windows Installation

The first step is to install the latest version of Python, which can be found here.  Make sure to download the latest 64-bit version of 3.x.  You must normally go through an intricate process to compile a binary Python package yourself.  Fortunately, the University of California, Irvine provides a repository of these packages with Windows installers, which will save you a great deal of time!

Install these packages, in this order (a quick verification script follows the list):

  • Python 3.x, get it here
  • pygame, get it at UCI
  • numpy, get it at UCI
  • scipy, get it at UCI
  • six, Pure Python, install with "pip install six"
  • dateutil, Pure Python, install with "pip install python-dateutil"
  • pytz, Pure Python, install with "pip install pytz"
  • pyparsing, Pure Python, install with "pip install pyparsing"
  • matplotlib, get it at UCI
  • theano, get it at UCI

Using Oracle in 64-bit has its own set of issues; however, once it is installed, it works fine.  I will add Oracle instructions soon.

Mac Installation

More to come

Linux Installation

More to come