Azure OpenAI binding spec

Detailed documentation on the Azure OpenAI binding component

Component format

To setup an Azure OpenAI binding create a component of type bindings.azure.openai. See this guide on how to create and apply a binding configuration. See this for the documentation for Azure OpenAI Service.

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: <NAME>
spec:
  type: bindings.azure.openai
  version: v1
  metadata:
  - name: apiKey # Required
    value: "1234567890abcdef"
  - name: endpoint # Required
    value: "https://myopenai.openai.azure.com"

Spec metadata fields

Field Required Binding support Details Example
endpoint Y Output Azure OpenAI service endpoint URL. "https://myopenai.openai.azure.com"
apiKey Y* Output The access key of the Azure OpenAI service. Only required when not using Microsoft Entra ID authentication. "1234567890abcdef"
azureTenantId Y* Input The tenant ID of the Azure OpenAI resource. Only required when apiKey is not provided. "tenentID"
azureClientId Y* Input The client ID that should be used by the binding to create or update the Azure OpenAI Subscription and to authenticate incoming messages. Only required when apiKey is not provided. "clientId"
azureClientSecret Y* Input The client secret that should be used by the binding to create or update the Azure OpenAI Subscription and to authenticate incoming messages. Only required when apiKey is not provided. "clientSecret"

Microsoft Entra ID authentication

The Azure OpenAI binding component supports authentication using all Microsoft Entra ID mechanisms. For further information and the relevant component metadata fields to provide depending on the choice of Microsoft Entra ID authentication mechanism, see the docs for authenticating to Azure.

Example Configuration

apiVersion: dapr.io/v1alpha1
kind: component
metadata:
  name: <NAME>
spec:
  type: bindings.azure.openai
  version: v1
  metadata:
  - name: endpoint
    value: "https://myopenai.openai.azure.com"
  - name: azureTenantId
    value: "***"
  - name: azureClientId
    value: "***"
  - name: azureClientSecret
    value: "***"

Binding support

This component supports output binding with the following operations:

Completion API

To call the completion API with a prompt, invoke the Azure OpenAI binding with a POST method and the following JSON body:

{
  "operation": "completion",
  "data": {
    "deploymentId": "my-model",
    "prompt": "A dog is",
    "maxTokens":5
    }
}

The data parameters are:

  • deploymentId - string that specifies the model deployment ID to use.
  • prompt - string that specifies the prompt to generate completions for.
  • maxTokens - (optional) defines the max number of tokens to generate. Defaults to 16 for completion API.
  • temperature - (optional) defines the sampling temperature between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. Defaults to 1.0 for completion API.
  • topP - (optional) defines the sampling temperature. Defaults to 1.0 for completion API.
  • n - (optional) defines the number of completions to generate. Defaults to 1 for completion API.
  • presencePenalty - (optional) Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics. Defaults to 0.0 for completion API.
  • frequencyPenalty - (optional) Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim. Defaults to 0.0 for completion API.

Read more about the importance and usage of these parameters in the Azure OpenAI API documentation.

Examples


curl -d '{ "data": {"deploymentId: "my-model" , "prompt": "A dog is ", "maxTokens":15}, "operation": "completion" }' \
      http://localhost:<dapr-port>/v1.0/bindings/<binding-name>

Response

The response body contains the following JSON:

[
  {
    "finish_reason": "length",
    "index": 0,
    "text": " a pig in a dress.\n\nSun, Oct 20, 2013"
  },
  {
    "finish_reason": "length",
    "index": 1,
    "text": " the only thing on earth that loves you\n\nmore than he loves himself.\"\n\n"
  }
]

Chat Completion API

To perform a chat-completion operation, invoke the Azure OpenAI binding with a POST method and the following JSON body:

{
    "operation": "chat-completion",
    "data": {
        "deploymentId": "my-model",
        "messages": [
            {
                "role": "system",
                "message": "You are a bot that gives really short replies"
            },
            {
                "role": "user",
                "message": "Tell me a joke"
            }
        ],
        "n": 2,
        "maxTokens": 30,
        "temperature": 1.2
    }
}

The data parameters are:

  • deploymentId - string that specifies the model deployment ID to use.
  • messages - array of messages that will be used to generate chat completions. Each message is of the form:
    • role - string that specifies the role of the message. Can be either user, system or assistant.
    • message - string that specifies the conversation message for the role.
  • maxTokens - (optional) defines the max number of tokens to generate. Defaults to 16 for the chat completion API.
  • temperature - (optional) defines the sampling temperature between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. Defaults to 1.0 for the chat completion API.
  • topP - (optional) defines the sampling temperature. Defaults to 1.0 for the chat completion API.
  • n - (optional) defines the number of completions to generate. Defaults to 1 for the chat completion API.
  • presencePenalty - (optional) Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics. Defaults to 0.0 for the chat completion API.
  • frequencyPenalty - (optional) Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim. Defaults to 0.0 for the chat completion API.

Example


curl -d '{
  "data": {
      "deploymentId": "my-model",
      "messages": [
          {
              "role": "system",
              "message": "You are a bot that gives really short replies"
          },
          {
              "role": "user",
              "message": "Tell me a joke"
          }
      ],
      "n": 2,
      "maxTokens": 30,
      "temperature": 1.2
  },
  "operation": "chat-completion"
}' \
http://localhost:<dapr-port>/v1.0/bindings/<binding-name>

Response

The response body contains the following JSON:

[
  {
    "finish_reason": "stop",
    "index": 0,
    "message": {
      "content": "Why was the math book sad? Because it had too many problems.",
      "role": "assistant"
    }
  },
  {
    "finish_reason": "stop",
    "index": 1,
    "message": {
      "content": "Why did the tomato turn red? Because it saw the salad dressing!",
      "role": "assistant"
    }
  }
]

Get Embedding API

The get-embedding operation returns a vector representation of a given input that can be easily consumed by machine learning models and other algorithms. To perform a get-embedding operation, invoke the Azure OpenAI binding with a POST method and the following JSON body:

{
    "operation": "get-embedding",
    "data": {
        "deploymentId": "my-model",
        "message": "The capital of France is Paris."
    }
}

The data parameters are:

  • deploymentId - string that specifies the model deployment ID to use.
  • message - string that specifies the text to embed.

Example


curl -d '{
  "data": {
      "deploymentId": "embeddings",
      "message": "The capital of France is Paris."
  },
  "operation": "get-embedding"
}' \
http://localhost:<dapr-port>/v1.0/bindings/<binding-name>

Response

The response body contains the following JSON:

[0.018574921,-0.00023652936,-0.0057790717,.... (1536 floats total for ada)]

Learn more about the Azure OpenAI output binding

Watch the following Community Call presentation to learn more about the Azure OpenAI output binding.


Last modified October 11, 2024: Fixed typo (#4389) (fe17926)