3 steps to create Financial Chatbot powered by ChatGPT Part 1

rockingdingo 2023-11-12 #Financial Chatbot # ChatGPT # Stock Price


3 steps to create Financial Chatbot powered by ChatGPT Part 1

Summary

In this blog, we will show you 3 easy steps to create your personal financial chatbot assistant powered by ChatGPT. ChatGPT is Large Language Model (LLM) based Artificial Intelligent service created by OpenAI company, who just released their latest Assistant API for chatbot creation. These AI assistants are very useful in financial industries. Common use cases include: Generating realtime stock price quotes, Analyzing financial data, Providing investment advices, generating summary of financial reports, etc.
Keywords: Financial Chatbot, Financial ChatGPT, Chatbot Stock Price, Chatbot NYSE, Chatbot.
Since financial data are usually realtime streaming data such as the realtime stock price, we are applying the Retrieval Augmented generation (RAG) techniques to help generating meaning response as contexts in three steps:
1. Parsing User Intent from Question (Query)
2. Retrieve Realtime Stock Price from NYSE, NASDAQ, HKEX website
3. Use ChatGPT to generate human language response from retrieved structured data.
In the following, we will show the detailed steps to create Financial Chatbot in Python language. And you can also visit the Chatbot demo on the deepnlp.org website, click the "AI Assistant button" and chat input "Stock price of Tesla and Google".

Navigation

  • 1. Parsing User Intent from Question (Query)

    Suppose user input query is "Stock Price of TESLA and GOOGLE", you want to parse user's intent from this query, which is finding latest stock price of Tesla(TSLA) and Google (GOOG) stocks. We can use the Trie data structure to efficiently store the list of all stock names crawled from NASDAQ and NYSE website, such as Google(GOOG), Apple(AAPL), Tesla (TSLA), etc. Then you can build a trie tree as in the following python code and efficiently parse user's financial intent from the query. The parsing result is shown below:


    Python Code

    #coding=utf-8
    #!/usr/bin/python
    
    import requests
    import datetime
    import time
    
    
    class Node(object):
        def __init__(self, value):
            self._children = {}
            self._value = value
            self._terminal = False
    
        def _add_child(self, char, value, overwrite=False, if_terminal=False):
            child = self._children.get(char)
            if child is None:
                child = Node(value)
                if if_terminal:
                    child._terminal = True
                self._children[char] = child
    
            if if_terminal:
                child._terminal = True
            if overwrite:
                child._value = value
            return child
    
    class Trie(Node):
        """ 
        """
    
        def __init__(self):
            super(Trie, self).__init__(None)
    
        def __contains__(self, key):
            return self[key] is not None
    
        def __getitem__(self, key):
            state = self
            for char in key:
                state = state._children.get(char)
                if state is None:
                    return None
            return state._value
    
        def get_prefix_full_value(self, key):
            full_value_list = []
            state = self
            for char in key:
                state = state._children.get(char)
                if state is None:
                    break
            # print ("DEBUG: get_prefix_full_value state is:" + str(state))
            if state is None:
                return None
            else:
                if state._terminal:
                    full_value_list.append(state._value)
                full_value_list.extend(self.traverse_childen(node=state))
                #print ("DEBUG: get_prefix_full_value state cur Node childern is:" + str(full_value_list))
            full_value_list_unique = list(set(full_value_list))
            full_value_list_sorted = sorted(full_value_list_unique, key=lambda x:len(x))
            return full_value_list_sorted
    
        def traverse_childen_all(self, node):
            """ 
            """
            state = node
            print ("DEBUG: traverse_childen cur State value|%s|terminal|%s" % (state._value, state._terminal))
            for (key, child) in state._children.items():
                if child is not None:
                    ## child is not None and child is terminal child
                    print ("DEBUG: traverse_childen child value|%s|terminal|%s" % (child._value, child._terminal))
                    self.traverse_childen_all(node = child)
    
        def traverse_childen(self, node):
            """ 
            """
            full_value_list = []
            state = node
            if state._terminal and len(state._children) > 0:
                full_value_list.extend([state._value])
            for (key, child) in state._children.items():
                ## child is not None and child is terminal child
                if child is not None:
                    # print ("DEBUG: traverse_childen child value|%s|terminal|%s" % (child._value, child._terminal))
                    if child._terminal:
                        full_value_list.extend([child._value])
                    full_value_list.extend(self.traverse_childen(node=child))
                else:
                    print ("DEBUG: traverse_childen child is None child|%s" % str(child))
            return full_value_list
    
        def __setitem__(self, key):
            state = self
            for i, char in enumerate(key):
                if i < len(key) - 1:
                    partial_key = key[0:i+1]
                    state = state._add_child(char, partial_key, False, if_terminal = False)
                else:
                    state = state._add_child(char, key, True, if_terminal = True)
    
    def test_trie_node():
        input_query_list = ["GOOG", "GOOGLE", "APPLE", "TESLA", "BOEING"]
        trie_tree = Trie()
        for query in input_query_list:
            trie_tree.__setitem__(query)
        ## 
        print ("DEBUG: Traverse G")    
        print (trie_tree["G"])
        print ("DEBUG: Traverse G")
        print (trie_tree.get_prefix_full_value("GOOG"))
        print ("DEBUG: Traverse A")
        print (trie_tree.get_prefix_full_value("A"))
        print ("DEBUG: Traverse TESLA")
        print (trie_tree.get_prefix_full_value("TES"))
    
    
    def main():
    
        query = "Stock Price of TESLA and GOOGLE"
    
        # Construct Trie Prefix
        input_stock_name_list = ["GOOG", "GOOGLE", "APPLE", "TESLA", "BOEING"]
        trie_tree = Trie()
        for name in input_stock_name_list:
            trie_tree.__setitem__(name)
    
        words = query.split(" ")
        parse_stock_name_list = []
        for word in words:
            if trie_tree[word] is not None:
                parse_stock_name_list.append(word)
    
        print ("DEBUG: Input Query|%s" % query)
        print ("DEBUG: Final Parsed Stock Quotes from Query|%s" % (str(parse_stock_name_list)))
    
        ## Result:
        # DEBUG: Final Parsed Stock Quotes from Query|TESLA and GOOGLE stock price|['TESLA', 'GOOGLE']
    
    if __name__ == '__main__':
        main()
    
            
  • Retrieve Realtime Stock Price from NYSE, NASDAQ, HKEX website

    The second step is to retrieve realtime stock price from NYSE, NASDAQ or HKEX website. Let's use Tesla stock as an example. You can visit the official NASDAQ or NYSE website for Tesla stock price as: https://www.nasdaq.com/market-activity/stocks/tsla

    Python Code

    
    def get_quote_from_nasdaq(quote):
        """
           quote: tsla
        """
        equity_data = {}
        equity_data["quote"] = quote
        try:
            url = "https://www.nasdaq.com/market-activity/stocks/%s" % quote.lower()
            headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36'}
            # house = 'https://www.hkex.com.hk/?sc_lang=EN'
            res = requests.get(url, headers=headers)
            soup = BeautifulSoup(res.text, 'html.parser')
    
            ## result processing from the soup result
            # equity_data=...
    
    
        except Exception as e:
            print ("DEBUG: get_quote_from_nasdaq get quote failed...")
            print (e)
        return equity_data
    
            
  • Use ChatGPT to generate human language response from retrieved structured data.

    ChatGPT and other AI Assistant such as perplexity.ai can analyze financial data and provide financial advice. You can give the realtime quote information to ChatGPT as prompt. It will generates human-language summary as response. Also you can ask related questions such as "What's the market capitalization given current stock price?", "Is tesla a buy or sell stock given current price?". Some example prompts are shown below.

    Example prompt 1


    Stock price quote for Tesla is high/low: $215.38/$205.69, 1 Year Target price is $250.00. Is Tesla a buy or sell stock given current price?

    Result 1


    Based on the search results, the consensus rating for Tesla stock is "Buy" with 48 buy ratings, 36 holding ratings, and 9 sell ratings. However, it's important to note stock market is unpredictable and subject to fluctuation.

    Example prompt 2


    What's Tesla's market capitalization given current stock price?

    Result 2


    According to search results, the market capitalization varies depending on the dates and sources. On November 23, 2023, the market capitalization of Tesla is $659.09 billon with 3.18 billon shares outstanding. It is important to note these values are subject to change due to fluctuation.


    Python Code