Make a Filter with GPT-3 Semantic Search

For some GPT-3 completion tasks, you may want to filter the text sent to or from an end user. The OpenAI API has a content filter that helps flag outputs that might be considered inappropriate, but you can build a more specific filter with Semantic Search to keep conversations heading in the direction you intend. For example, you could check that both the text sent to the API and the text it generates relate to a specific topic, or create a filter that flags content with more precision.

Before getting started, it’s important to point out that content filtering is an extremely challenging problem. While a content filter built with Semantic Search can be useful, there’s no guarantee that it will work in every case. You should test a filter extensively before deployment to get an idea of how effective it will be in real use.

This is an interactive tutorial. Each step will explain the broad concept and the code used to perform the function. You can input your own examples to experiment with the application.

The code here is meant to illustrate how you would create the specific functions and the theory behind them. It uses very basic JavaScript functions that can be easily converted to other languages. You may need to do additional coding to implement your own filter, and ideally you would perform these functions server-side.

In this example we’ll build a filter that flags whether an input or output is dog-related.

Step 1: The user submits a question or the API returns a result

            HTML example

            <!-- Textarea question input -->

            <textarea id="inputQuestion">How do I stop my puppy from walking on the ceiling?</textarea>

Step 2: Use Semantic Search to give scores to a list of text blocks relating to the topic

The most important part of the filter is the list of text blocks you use to generate the scores. If we want to see how dog-related the input is, we can create a simple list that says “dog”, “cat”, “cats and dogs” and “no cats or dogs.” For more sophisticated queries you’ll want to provide more examples and a function to measure how like or unlike your input is to them.

In this example we use a list of text blocks separated by “###” that we will then split into an array to send to the Semantic Search API along with the query.
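As a rough sketch of the splitting step described above, the snippet below turns a “###”-separated string into the documents array and pairs it with the query. The helper name buildSearchPayload is hypothetical and only for illustration; dropping empty blocks is an added assumption, not something the tutorial code does.

```javascript
// Hypothetical helper (not part of the tutorial code): builds the request
// body from a "###"-separated string of text blocks and a query.
function buildSearchPayload(rawText, query) {
    const documents = rawText
        .split("###")                          // one entry per text block
        .map((block) => block.trim())          // remove surrounding white space
        .filter((block) => block.length > 0);  // assumption: skip empty blocks

    return { documents: documents, query: query };
}

const payload = buildSearchPayload(
    "dog ### cat ### cats and dogs ### no cats or dogs",
    "How do I stop my puppy from walking on the ceiling?"
);

// Show the body that would be serialized and sent with the request
console.log(JSON.stringify(payload, null, 2));
```

Keeping this step in a small pure function makes it easy to check the text blocks before any request is sent.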

JavaScript and jQuery


// This is the API request for Semantic Search


// This variable will hold the different examples from the textarea.
// In this example the text in the textarea above will be imported and 
// divided into an array using the "###" tag as the split indicator.
var documents;
 
// This variable will hold the scores for each section returned by the Semantic Search API
var scores;


function apiScore(){

    // Split the text in the textarea into a list of text blocks and remove surrounding white space
    documents = inputDocuments.value.split("###").map(function (block) {
        return block.trim();
    });

    // Clear the textarea (a textarea's displayed text is its value, not its innerHTML)
    inputDocuments.value = "";

    // The engine to use for Semantic Search
    var selectedEngine = "babbage";

    // The information we're sending to the API: an array of documents and the question as the query
    var _data = {
        "documents": documents,
        "query": inputQuestion.value
    };

    // The headers for the request
    var _headers = {
        'Content-Type': 'application/json',
        'Accept': 'application/json',
        'Authorization': `Bearer ${apiKeyInput.value}`
    };

    // Our jQuery request
    $.ajax({
        type: 'POST',
        url: `https://api.openai.com/v1/engines/${selectedEngine}/search`,
        dataType: 'json',
        headers: _headers,
        data: JSON.stringify(_data),
        success: function(result){

            // Assign the result data to the scores variable to be used later
            scores = result.data;

            // Display the results in the document textarea
            result.data.forEach(element => {
                var _doc = documents[element.document];
                var _score = "Document: " + element.document + "\nScore: " + element.score + "\n----------------\n" + _doc + "\n\n";
                inputDocuments.value += _score;
            });

        }
    });

}

Step 3: Sort and return the top scoring match

We next use a simple JavaScript function to sort the list of text blocks by the scores returned by Semantic Search. This shows us which ones scored highest and lowest.

For our dog topic detector, we’ll just use the highest-scoring result to determine whether the input is about dogs. For more complex examples, you might average the scores of the top-scoring text blocks and compare them to the lowest-scoring ones, or use some other method to determine how relevant the input is to what you want to accomplish.
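One way the averaging idea above might look is sketched here. The function name relevanceMargin and the choice of k are hypothetical, and the sample scores are made up for illustration rather than taken from a real API response; only the entry shape (a document index plus a score) follows the tutorial code.

```javascript
// Hypothetical helper: the mean of the k highest scores minus the mean of the
// k lowest scores. A larger margin suggests a clearer separation.
function relevanceMargin(scores, k) {
    // Copy before sorting so the caller's array is left untouched
    const sorted = scores.slice().sort((x, y) => y.score - x.score);
    const n = Math.min(k, sorted.length);
    if (n === 0) {
        return 0; // nothing to compare
    }

    const mean = (items) =>
        items.reduce((sum, item) => sum + item.score, 0) / items.length;

    const top = sorted.slice(0, n);
    const bottom = sorted.slice(sorted.length - n);
    return mean(top) - mean(bottom);
}

// Made-up scores in the same shape the tutorial code reads (document, score)
const sampleScores = [
    { document: 0, score: 200 },
    { document: 1, score: 50 },
    { document: 2, score: 150 },
    { document: 3, score: 20 }
];

console.log(relevanceMargin(sampleScores, 2));
```

Any cut-off applied to this margin would need to be tuned against your own test inputs.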

In our code we sort the list of scores and text block indices returned by Semantic Search and then present the highest scoring text block.

In advanced use cases, you would keep a dictionary that maps each text block to the label you want to assign to it, and then return the label instead of the actual text block.
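A minimal sketch of the label dictionary idea follows. The labels, the labelsByDocument map and the labelForTopMatch function are all hypothetical names chosen for this illustration, and the scores are made-up values in the shape the tutorial code reads.

```javascript
// Hypothetical mapping from each text block to the label we want to report
const labelsByDocument = new Map([
    ["dog", "dog-related"],
    ["cats and dogs", "dog-related"],
    ["cat", "not dog-related"],
    ["no cats or dogs", "not dog-related"]
]);

// Hypothetical helper: returns the label of the highest-scoring text block,
// or null if there are no scores or the block has no label.
function labelForTopMatch(documents, scores) {
    if (scores.length === 0) {
        return null;
    }

    // Find the entry with the highest score without reordering the array
    const top = scores.reduce((best, entry) =>
        entry.score > best.score ? entry : best
    );

    const text = documents[top.document];
    return labelsByDocument.has(text) ? labelsByDocument.get(text) : null;
}

const exampleDocuments = ["dog", "cat", "cats and dogs", "no cats or dogs"];

// Made-up scores for illustration only
const exampleScores = [
    { document: 0, score: 210 },
    { document: 1, score: 40 },
    { document: 2, score: 120 },
    { document: 3, score: 15 }
];

console.log(labelForTopMatch(exampleDocuments, exampleScores));
```

Returning a label rather than the raw text block means the wording of the examples can change without affecting the code that acts on the result.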

JavaScript and jQuery


// This function will sort through the returned list of scores and
// rank them from highest to lowest and display them in a div.

// This variable will hold our highest scoring document.
// You can also use an array to hold more than one high scoring document 
var highestScoredDocument = "";

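function sortDocs(){

    // Reset the top match so that repeated runs don't keep a stale result
    highestScoredDocument = "";

    // This is the div where we want to display the ranked sections
    rankedDocuments.innerHTML = "";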

    // This sorts the scores array from highest to lowest using each entry's "score" attribute
    scores.sort(function (x, y) {
        return y.score - x.score;
    });

    // This iterates through the scores and displays them in the rankedDocuments div
    scores.forEach((score)=>{

        // The scores array only stores the score and the index of each document.
        // To retrieve the actual document text we use the scores document 
        // index (score.document) to find the matching document. 
        var _doc = documents[score.document];

        var _score = "Score: " + score.score + "\n----------------\n" + _doc + "\n";
        rankedDocuments.innerHTML += _score;

        // This stores the first (highest scoring) document in the highestScoredDocument variable
        if(highestScoredDocument == ""){
            highestScoredDocument = _doc;

            // This displays our top-ranking result
            topScoreResult.innerHTML = `<h2>${_doc}</h2>`;
        }
        
    });

        
}

Experimenting more with Semantic Search filtering

You can try the different examples in the drop-down selector in Step 1, add your own input, and create your own text blocks for Semantic Search to compare them with. You can also experiment with the different engines to see how they compare.
