Monday, May 27, 2019

Google Cloud Speech API: Qwik Start - Google Cloud Platform (GCP) Hands-on Lab Manual

Create an API Key

Since you'll be using curl to send a request to the Speech API, you'll need to generate an API key to pass in your request URL.
To create an API key, click Navigation menu > APIs & services > Credentials:
Then click Create credentials:
In the drop down menu, select API key:
Copy the key you just generated.
Now that you have an API key, save it to an environment variable so you don't have to paste the key into each request. You can do this in the Cloud Shell command line. In the following command, be sure to replace <YOUR_API_KEY> with the key you just copied.
export API_KEY=<YOUR_API_KEY>
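Before moving on, it can help to confirm that the variable is actually set in the current shell session. The following is a minimal sketch; the function name check_api_key is hypothetical and not part of the lab. It reports only the length of the key so the value itself is never printed.

```shell
# Hypothetical helper (not part of the lab): succeed only if API_KEY is
# set and non-empty, without echoing the key itself.
check_api_key() {
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set; run: export API_KEY=<YOUR_API_KEY>" >&2
    return 1
  fi
  echo "API_KEY is set (${#API_KEY} characters)"
}
```

Running check_api_key after the export should print the key length; a non-zero exit status means the export did not take effect in this session.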

Create your Speech API request



Create request.json in the Cloud Shell command line. You'll use this file to build your request to the Speech API:
touch request.json
Now open request.json using your preferred command line editor (nano, vim, emacs) or gcloud. Add the following to your request.json file, using the uri value of the sample audio file:
{
  "config": {
      "encoding":"FLAC",
      "languageCode": "en-US"
  },
  "audio": {
      "uri":"gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
The request body has a config object and an audio object.
In config, you tell the Speech API how to process the request:
  • The encoding parameter tells the API which audio encoding the file you send uses. FLAC is the encoding of the sample .flac file (see the Speech API documentation on encoding types for more details).
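There are other parameters you can add to your config object. In the v1 API used here, languageCode is the required field; encoding can be omitted for FLAC and WAV files, because their headers declare the encoding.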
In the audio object, you pass the API the uri of the audio file in Cloud Storage.
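If you would rather not open an editor, the same file can be written in one step with a shell here-document. This is a sketch of an alternative to the touch-and-edit steps, using exactly the request body shown above; it overwrites any existing request.json in the current directory.

```shell
# Write the request body to request.json in the current directory.
# The quoted 'EOF' delimiter prevents the shell from expanding anything
# inside the JSON.
cat > request.json <<'EOF'
{
  "config": {
      "encoding":"FLAC",
      "languageCode": "en-US"
  },
  "audio": {
      "uri":"gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
EOF
```

Running cat request.json afterwards is a quick way to confirm the contents before sending the request.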
Now you're ready to call the Speech API!

Call the Speech API

Pass your request body, along with the API key environment variable, to the Speech API with the following curl command (all in one single command line):
curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json \
"https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}"
Your response should look something like this:
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
The transcript value is the Speech API's text transcription of your audio file, and the confidence value indicates how sure the API is that the transcription is accurate.
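To pull just the transcript text out of a response like the one above, a small sed expression is enough for this simple case. The sketch below assumes the response has been saved to a file (the name sample_response.json is an assumption) in the pretty-printed layout shown, with one transcript per line and no escaped quotes inside it. Where jq is installed, jq -r '.results[0].alternatives[0].transcript' is a more robust choice.

```shell
# Sample response copied from the output above; in practice this file would
# come from redirecting the curl output (the file name is an assumption).
cat > sample_response.json <<'EOF'
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
EOF

# Capture the text between the quotes that follow "transcript": and print
# only lines where the substitution matched.
transcript=$(sed -n 's/.*"transcript": *"\([^"]*\)".*/\1/p' sample_response.json)
echo "$transcript"
```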
You'll notice that you called the recognize method in the request above, which performs synchronous recognition. The Speech API supports both synchronous and asynchronous speech-to-text transcription. In this example you sent it a complete audio file, but the API also offers streaming recognition, which transcribes speech while the user is still speaking.
You created a Speech API request and then called the Speech API.
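To repeat the call later, the curl command from the "Call the Speech API" step can be wrapped in a small shell function that saves the response to a file. This is a minimal sketch, assuming request.json is in the current directory and API_KEY is exported; the function name recognize_to_file and the default file name result.json are illustrative choices, not part of the lab.

```shell
# Hypothetical wrapper around the lab's curl command. The first argument is
# the output file; it defaults to result.json (an assumed name).
recognize_to_file() {
  out_file="${1:-result.json}"
  curl -s -X POST -H "Content-Type: application/json" \
    --data-binary @request.json \
    "https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}" \
    > "$out_file"
}
```

Calling recognize_to_file sends a real request to the API, so run it only when you want a fresh response; cat result.json then shows what came back.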
