Chatterbot (QBASIC)


A simple Chatterbot example

In principle, a chatterbot can be a very short program. For instance, the following program implements a chatterbot which learns phrases in any language by repetition, in much the same way that a parrot does.

The first section just calls a set of routines in order. The routines have been named to make their function reasonably obvious. Don't let the GOSUBs worry you: each one simply calls a labelled routine that shares the program's global variables, and every variable in this short program is global.

<<Main>>=
QChatt:
  DEFINT A-Z
  GOSUB Initialise
  GOSUB LoadData
  GOSUB Converse
  GOSUB StoreData
  SYSTEM

First we set up a few items. This chatterbot works by keeping track of the last six characters (the number is adjustable) and using them to predict the next character. The variable ContextLength says how many characters to track, the array Context$ stores the resulting text fragments, and the array Alternatives$ keeps track of the characters which can follow each fragment. The DictionaryFile$ variable names the file used as permanent storage for the results of all sessions to date.

The purpose of RANDOMIZE TIMER is just to ensure that the random number generator has been properly seeded, so that the RND calls later in the program produce "unexpected" results.

<<Initialise>>=
Initialise:
  LET DictionarySize = 1000
  DIM Context$(DictionarySize) 'The character sequences that QChatt has already seen
  DIM Alternatives$(DictionarySize) 'The characters that QChatt may print after recognising a sequence.
  LET EmptyRow = 0
  LET EndOfResponseCharacter$ = CHR$(180)
  LET ContextLength = 6  'A bigger value makes QChatt more grammatical but slower learning.
  LET CurrentContext$ = STRING$(ContextLength, EndOfResponseCharacter$)
  LET DictionaryFile$ = "QCHATT.MEM"
  RANDOMIZE TIMER
  RETURN

The Converse routine runs the top-level sentence/response loop. It passes anything that the Human says to an analysis routine, MemoriseHumanResponse, which picks it apart and uses it as raw material for possible future Computer responses. It then prints a Computer response created by the GenerateComputerResponse routine, which uses the results of previous analyses if any of them are appropriate. If none is, the response is left empty and the loop waits for the next Human response. When the Human enters an empty line, the loop stops.

<<Converse>>=
Converse:
  DO
    LINE INPUT "Human: "; Response$
    IF Response$ = "" THEN EXIT DO
    LET Response$ = Response$ + EndOfResponseCharacter$
    GOSUB MemoriseHumanResponse
    LET Response$ = ""
    GOSUB GenerateComputerResponse
    PRINT "Computer: "; Response$
  LOOP
  RETURN

So how does the computer analyse the Human responses? It uses the CurrentContext$ variable to keep track of the last few characters, as many as ContextLength specifies. It runs through the Human's response character by character, adding each character to the right end of CurrentContext$ while dropping the character on the left end. At the start of a Human response, CurrentContext$ normally holds the last few characters of the preceding Computer response. This is what links the responses together, so that the GenerateComputerResponse routine has a chance of coming up with something halfway sensible for a given Human sentence.

The character in the CurrentCharacter$ variable is what the CurrentContext$ text fragment should later predict in the GenerateComputerResponse procedure, so the InsertCharacter procedure records the fact that this particular CurrentCharacter$ can follow that CurrentContext$.

<<MemoriseHumanResponse>>=
MemoriseHumanResponse:
  DO WHILE Response$ > ""
    LET CurrentCharacter$ = LEFT$(Response$, 1)
    LET Response$ = MID$(Response$, 2)
    GOSUB InsertCharacter
    LET CurrentContext$ = MID$(CurrentContext$, 2) + CurrentCharacter$
  LOOP
  RETURN
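
To see the sliding window in action, here is a small stand-alone trace. It is only a sketch and not part of QChatt.bas: it assumes a context length of 3 instead of 6, uses "#" as a printable stand-in for EndOfResponseCharacter$, and the file name ContextTrace.bas is made up for the illustration.

<<ContextTrace.bas>>=
'Illustrative only: traces how the context window slides over one response.
DEFINT A-Z
LET Marker$ = "#"        'Assumed printable stand-in for EndOfResponseCharacter$
LET ContextLength = 3    'Shorter than QChatt's 6 to keep the trace small
LET CurrentContext$ = STRING$(ContextLength, Marker$)
LET Response$ = "hi" + Marker$
DO WHILE Response$ > ""
  LET CurrentCharacter$ = LEFT$(Response$, 1)
  LET Response$ = MID$(Response$, 2)
  'This is the pair that InsertCharacter would record.
  PRINT CurrentContext$; " -> "; CurrentCharacter$
  LET CurrentContext$ = MID$(CurrentContext$, 2) + CurrentCharacter$
LOOP

Run on its own, it should print one line per character of the input, with the context on the left and the character recorded against it on the right:

### -> h
##h -> i
#hi -> #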

So now that we know how to analyse the text, how do we use the results to generate the Computer response? We use the CurrentContext$ variable in the same way that we did for the analysis. The difference is that we use it to retrieve the possible following characters stored in Alternatives$, instead of recording the current character from the Human response. When we retrieve the alternatives, we may find that more than one character can follow this particular CurrentContext$ text fragment. If so, we pick one at random. We may also find that a particular CurrentContext$ text fragment has no following characters recorded in Alternatives$. In that case we stop the generation and return the result. Generation also stops when the chosen character is EndOfResponseCharacter$, which marks the end of a learnt response and is not added to the output.

<<GenerateComputerResponse>>=
GenerateComputerResponse:
  DO
    GOSUB Lookup
    'Pick one of the recorded following characters at random ("" if there are none).
    LET CurrentCharacter$ = MID$(Alternatives$(DictionaryIndex), INT(RND * LEN(Alternatives$(DictionaryIndex))) + 1, 1)
    IF CurrentCharacter$ = "" THEN
      EXIT DO
    ELSE
      LET CurrentContext$ = MID$(CurrentContext$, 2) + CurrentCharacter$
      IF CurrentCharacter$ = EndOfResponseCharacter$ THEN
        EXIT DO
      ELSE
        LET Response$ = Response$ + CurrentCharacter$
      END IF
    END IF
  LOOP
  RETURN
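
The random choice is packed into a single long expression, so it may help to see it in isolation. The following stand-alone sketch is not part of QChatt.bas; the names Candidates$, Pick and Trial, and the file name, are invented for the illustration.

<<PickAlternative.bas>>=
'Illustrative only: picking one character at random from a string of alternatives.
DEFINT A-Z
RANDOMIZE TIMER
LET Candidates$ = "aeo"
FOR Trial = 1 TO 5
  'RND is at least 0 and below 1, so Pick ranges from 1 to LEN(Candidates$).
  LET Pick = INT(RND * LEN(Candidates$)) + 1
  PRINT MID$(Candidates$, Pick, 1);
NEXT
PRINT
'With no alternatives the length is 0, Pick is always 1 and MID$ returns "".
LET Candidates$ = ""
LET Pick = INT(RND * LEN(Candidates$)) + 1
IF MID$(Candidates$, Pick, 1) = "" THEN PRINT "no alternatives"

The first line of output varies from run to run. The second case is the one that GenerateComputerResponse relies on to notice that a text fragment has nothing recorded against it.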

Now for a few utility routines, starting with InsertCharacter. It is called that because it records a new possible following character for a particular CurrentContext$ text fragment, but only if that character has not already been recorded. To ensure that it adds the character to the correct set of Alternatives$, it first calls the Lookup procedure, which converts the CurrentContext$ text fragment into an array index stored in the DictionaryIndex variable.

<<InsertCharacter>>=
InsertCharacter:
  GOSUB Lookup
  IF INSTR(Alternatives$(DictionaryIndex), CurrentCharacter$) = 0 THEN
    LET Alternatives$(DictionaryIndex) = Alternatives$(DictionaryIndex) + CurrentCharacter$
  END IF
  RETURN
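
The INSTR test is what keeps each set of alternatives free of duplicates. A minimal stand-alone sketch of the same idea follows; it is not part of QChatt.bas, and the names Seen$, Followers$, Candidate$ and Char are assumptions made for the example.

<<AlternativesDemo.bas>>=
'Illustrative only: collecting each distinct character once, in order of first appearance.
DEFINT A-Z
LET Seen$ = "abac"       'Characters observed after some context, in order
LET Followers$ = ""
FOR Char = 1 TO LEN(Seen$)
  LET Candidate$ = MID$(Seen$, Char, 1)
  'INSTR returns 0 when Candidate$ does not yet occur in Followers$.
  IF INSTR(Followers$, Candidate$) = 0 THEN LET Followers$ = Followers$ + Candidate$
NEXT
PRINT Followers$         'Prints abc

One consequence of dropping duplicates is that every recorded follower is equally likely to be chosen later, however often it was actually seen.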

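In this simple version of the QChatt program, the Lookup routine is implemented as a linear search. The CurrentContext$ text fragment is compared against each stored Context$ in turn and, if a matching one is found, its index is returned in DictionaryIndex. To guarantee that the search stops, the routine first copies CurrentContext$ into the first unused row, Context$(EmptyRow); a match at that row means the fragment is new, and the row is claimed for it provided the dictionary is not yet full. This is the least efficient part of the program. In other versions of QChatt I have replaced it with a hash array or with a trie, which allows much larger amounts of data to be handled. However, in this article I have gone for simplicity over efficiency.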

<<Lookup>>=
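Lookup:
  'Plant the wanted fragment in the first unused row as a sentinel so the search always stops.
  LET Context$(EmptyRow) = CurrentContext$
  LET DictionaryIndex = 0
  DO WHILE CurrentContext$ <> Context$(DictionaryIndex)
    LET DictionaryIndex = DictionaryIndex + 1
  LOOP
  IF DictionaryIndex = EmptyRow THEN
    'New fragment: start it with no alternatives, so nothing stale is left over in a reused row.
    LET Alternatives$(EmptyRow) = ""
    'Claim the row unless it is the spare row at DictionarySize, which is reused as scratch space.
    IF EmptyRow < DictionarySize THEN LET EmptyRow = EmptyRow + 1
  END IF
  RETURN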
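
For comparison, here is a rough idea of what a hashed lookup could look like. This is a hypothetical stand-alone sketch using linear probing, not the author's hashed version and not part of QChatt.bas. The table size of 11, the multiplier 31 and all the names (Slot$, Fragment$, FindSlot and so on) are assumptions chosen for the illustration, and it assumes the table never fills up.

<<HashLookupSketch.bas>>=
'Illustrative only: a hash table with linear probing in place of a linear search.
DEFINT A-Z
LET TableSize = 11
DIM Slot$(TableSize - 1)  'An empty string marks an unused slot
FOR Item = 1 TO 4
  READ Fragment$
  GOSUB FindSlot
  IF Slot$(Index) = "" THEN
    LET Slot$(Index) = Fragment$
    PRINT Fragment$; " stored"
  ELSE
    PRINT Fragment$; " seen before"
  END IF
NEXT
END
DATA "how ar", "ow are", "how ar", "thanks"

FindSlot:
  'Reduce the fragment to a starting slot number; Hash& is a long to leave room for the multiply.
  LET Hash& = 0
  FOR Char = 1 TO LEN(Fragment$)
    LET Hash& = (Hash& * 31 + ASC(MID$(Fragment$, Char, 1))) MOD TableSize
  NEXT
  LET Index = Hash&
  'Step past slots occupied by other fragments until a match or an unused slot is found.
  DO WHILE Slot$(Index) <> "" AND Slot$(Index) <> Fragment$
    LET Index = (Index + 1) MOD TableSize
  LOOP
  RETURN

It should report the first, second and fourth fragments as stored and the third as seen before. Using something like this in QChatt itself would also mean changing LoadData and StoreData, since the occupied rows would no longer be contiguous, and adding a guard for a full table.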

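The LoadData and StoreData routines do just what it says on the tin. They are included to make sure that QChatt doesn't forget everything when you close the program down. The file holds two lines per dictionary row: the Context$ fragment followed by its Alternatives$. LoadData first opens the file FOR APPEND and closes it again, which creates an empty file if none exists yet, so that the subsequent OPEN ... FOR INPUT does not fail on the first run.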

<<LoadAndStoreData>>=
LoadData:
  OPEN DictionaryFile$ FOR APPEND AS #1
  CLOSE #1
  OPEN DictionaryFile$ FOR INPUT AS #1
  DO WHILE EmptyRow < DictionarySize AND NOT EOF(1)
    LINE INPUT #1, Context$(EmptyRow)
    LINE INPUT #1, Alternatives$(EmptyRow)
    LET EmptyRow = EmptyRow + 1
  LOOP
  CLOSE #1
  RETURN

StoreData:
  OPEN DictionaryFile$ FOR OUTPUT AS #1
  FOR DictionaryIndex = 0 TO EmptyRow - 1
    PRINT #1, Context$(DictionaryIndex)
    PRINT #1, Alternatives$(DictionaryIndex)
  NEXT
  CLOSE #1
  RETURN
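
To inspect what QChatt has learnt, the file can be read back in the same two-lines-per-row pattern. The following stand-alone helper is a sketch, not part of QChatt.bas; it assumes that QCHATT.MEM already exists in the current directory and was written by StoreData, and the names Fragment$ and Followers$ are invented for the example.

<<DumpMemory.bas>>=
'Illustrative only: lists each stored context with the characters recorded after it.
DEFINT A-Z
OPEN "QCHATT.MEM" FOR INPUT AS #1
DO WHILE NOT EOF(1)
  LINE INPUT #1, Fragment$    'First line of the pair: the context
  LINE INPUT #1, Followers$   'Second line: its alternatives, possibly empty
  PRINT "["; Fragment$; "] -> ["; Followers$; "]"
LOOP
CLOSE #1

The brackets make empty sets of alternatives visible. These arise because GenerateComputerResponse also calls Lookup, which adds a row for any context it has not seen before.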

So putting it all together as ...

<<QChatt.bas>>=
<<Main>>
<<Initialise>>
<<Converse>>
<<MemoriseHumanResponse>>
<<GenerateComputerResponse>>
<<InsertCharacter>>
<<Lookup>>
<<LoadAndStoreData>>

... we end up with a program which can be interpreted by QBASIC or compiled by QuickBASIC 4.5 or VBDOS 1.

QChatt learning sessions

Note that to begin with, this chatterbot knows nothing and therefore says nothing. However, if one holds a simple conversation with it, it will, like a parrot, begin to reply as it starts to find responses that are appropriate for the immediately preceding sentence.

The following sessions show QChatt being taught a basic conversational pattern in English. A session is started by running the QChatt program and ended by pressing the return key without giving a response to the computer.

(Session 1)

Human: hello there
Computer:
Human: hello
Computer:
Human: how are you
Computer:
Human: fine thanks how are you
Computer: fine thanks how are you
Human: fine thanks
Computer:
Human:

(Session 2)

Human: hello
Computer: how are you
Human: fine thanks how are you
Computer: fine thanks
Human: glad to hear it
Computer:
Human:

(Session 3)

Human: hello
Computer: how are you
Human: fine thanks
Computer: glad to hear it
Human:

The responses are stored for the next session, so there is no need to reteach them. It should be noted, however, that while this example chatterbot will create "original" sentences after a little teaching, it has a very limited capacity and works best with short sentences.
