Create an API Key
Since you'll be using curl to send a request to the Speech API, you'll need to generate an API key to pass in your request URL.
To create an API key, click Navigation menu > APIs & services > Credentials:
Then click Create credentials:
In the drop-down menu, select API key:
Copy the key you just generated.
Now that you have an API key, save it to an environment variable so you don't have to insert the key value in each request. You can do this in the Cloud Shell command line. In the following command, be sure to replace <YOUR_API_KEY> with the key you just copied:
export API_KEY=<YOUR_API_KEY>
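Before moving on, it can help to confirm that the variable was actually exported in the current Cloud Shell session. The following is a minimal sketch; check_api_key is a hypothetical helper name introduced here for illustration, not part of the lab or of any Google tooling.

```shell
# check_api_key is a hypothetical helper (not part of the lab).
# It succeeds only if API_KEY is set and non-empty.
check_api_key() {
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set; run: export API_KEY=<YOUR_API_KEY>" >&2
    return 1
  fi
  # Print only the key length so the secret itself is not echoed.
  echo "API_KEY is set (${#API_KEY} characters)"
}
```

Running check_api_key after the export command should report the length of your key. If it prints the "not set" message instead, the export was probably run in a different session.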
Create your Speech API request
Create request.json in the Cloud Shell command line. You'll use this file to build your request to the Speech API:
touch request.json
Now open request.json using your preferred command line editor (nano, vim, emacs) or the Cloud Shell editor. Add the following to your request.json file, using the uri value of the sample audio file:
{
  "config": {
    "encoding": "FLAC",
    "languageCode": "en-US"
  },
  "audio": {
    "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
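If you would rather not open an editor, the same file can be written in one step with a shell here-document. This is a sketch of an alternative to the steps above, not something the lab prescribes. The JSON is identical to the request body shown.

```shell
# Write the request body in one step with a here-document.
# Quoting 'EOF' stops the shell from expanding anything inside the JSON.
cat > request.json <<'EOF'
{
  "config": {
    "encoding": "FLAC",
    "languageCode": "en-US"
  },
  "audio": {
    "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
EOF

# Print the file back to confirm what was written.
cat request.json
```

Note that this overwrites any existing request.json in the current directory.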
The request body has a config object and an audio object.
In config, you tell the Speech API how to process the request:
- The encoding parameter tells the API which type of audio encoding is used by the file you're sending. FLAC is the encoding type for .flac files (see the documentation on encoding types for more details).
- The languageCode parameter tells the API which language is spoken in the audio, here US English (en-US).
There are other parameters you can add to your config object. In the v1 API, languageCode is required, while encoding is optional for FLAC and WAV files and required for all other audio formats.
In the audio object, you pass the API the uri of the audio file in Cloud Storage.
Now you're ready to call the Speech API!
Call the Speech API
Pass your request body, along with the API key environment variable, to the Speech API with the following curl command (all on a single command line):
curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json \
"https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}"
Your response should look something like this:
{
"results": [
{
"alternatives": [
{
"transcript": "how old is the Brooklyn Bridge",
"confidence": 0.98267895
}
]
}
]
}
The transcript value contains the Speech API's text transcription of your audio file, and the confidence value indicates how sure the API is that it has accurately transcribed your audio.
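To use these two values in a script, they can be pulled out of the response text. The sketch below saves a copy of the sample response shown above to a local file named response.json (a file name assumed here for illustration) and extracts the fields with sed. It relies on the one-field-per-line layout of that sample, so a proper JSON tool such as jq would be the more robust choice where it is available.

```shell
# Sample response, copied from the output shown above, saved locally
# so the extraction can be tried without calling the API.
cat > response.json <<'EOF'
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
EOF

# Pull out the first transcript and confidence values with sed.
# This assumes each field sits on its own line, as in the sample.
transcript=$(sed -n 's/.*"transcript": *"\(.*\)".*/\1/p' response.json | head -n 1)
confidence=$(sed -n 's/.*"confidence": *\([0-9.]*\).*/\1/p' response.json | head -n 1)

echo "transcript: ${transcript}"
echo "confidence: ${confidence}"
```

To run the same extraction on a live response, redirect the output of the curl command into response.json instead of using the here-document.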
You'll notice that you called the recognize method in the request above. The Speech API supports both synchronous and asynchronous speech-to-text transcription. In this example you sent it a complete audio file, but the Speech API also supports streaming speech-to-text transcription while the user is still speaking.
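The synchronous and asynchronous calls differ mainly in the method name at the end of the URL. As a sketch, and assuming the v1 REST method names recognize and longrunningrecognize, the two URLs can be built as follows. speech_url and call_speech are hypothetical helper names introduced here, not part of the lab.

```shell
# speech_url is a hypothetical helper: it builds the request URL
# for a given v1 method name, using the API_KEY variable set earlier.
speech_url() {
  printf 'https://speech.googleapis.com/v1/speech:%s?key=%s\n' "$1" "${API_KEY:-}"
}

# call_speech is also hypothetical: it sends request.json to the chosen
# method. It needs network access and a valid key, so it is only defined
# here, not invoked.
call_speech() {
  curl -s -X POST -H "Content-Type: application/json" \
    --data-binary @request.json "$(speech_url "$1")"
}

# Show the two URLs without sending anything.
speech_url recognize
speech_url longrunningrecognize
```

Running call_speech recognize should behave like the curl command used earlier. The asynchronous variant is expected to return an operation name instead of a transcript, and the result is then retrieved by polling that operation.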
You created a Speech API request and then called the Speech API.