Create an API Key
Since you'll be using curl to send a request to the Speech API, you'll need to generate an API key to pass in your request URL.
To create an API key, click Navigation menu > APIs & services > Credentials.
Then click Create credentials.
In the drop-down menu, select API key.
Copy the key you just generated.
Now that you have an API key, save it to an environment variable so you don't have to insert its value in each request. You can do this in the Cloud Shell command line. In the following command, be sure to replace <YOUR_API_KEY> with the key you just copied:
export API_KEY=<YOUR_API_KEY>
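Before moving on, it can help to confirm that the variable is actually set in the current shell. The following is a minimal sketch of such a check. The function name require_api_key is a hypothetical helper introduced here for illustration and is not part of Cloud Shell or gcloud.

```shell
# Hypothetical helper (not part of Cloud Shell): fail early if API_KEY
# is empty or unset in the current shell.
require_api_key() {
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set; run: export API_KEY=<YOUR_API_KEY>" >&2
    return 1
  fi
  # Print only the key length so the key itself never appears in the terminal.
  echo "API_KEY is set (${#API_KEY} characters)"
}
```

Running require_api_key before the curl call later in this post should catch a missing key before a request is sent.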
Create your Speech API request
Create request.json in the Cloud Shell command line. You'll use this file to build your request to the Speech API:
touch request.json
Now open the request.json file using your preferred command line editor (nano, vim, emacs) or gcloud. Add the following to your request.json file, using the uri value of the sample raw audio file:
{
  "config": {
    "encoding": "FLAC",
    "languageCode": "en-US"
  },
  "audio": {
    "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
The request body has a config object and an audio object.
In config, you tell the Speech API how to process the request:
- The encoding parameter tells the API which type of audio encoding is used by the file being sent to the API. FLAC is the encoding of the sample .flac file (see the Speech API documentation on encoding types for more details).
- The languageCode parameter tells the API which language the audio is in, as a BCP-47 tag (en-US here).
There are other parameters you can add to your config object, but encoding and languageCode are the only ones this request needs.
In the audio object, you pass the API the uri of the audio file in Cloud Storage.
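If you prefer not to open an editor, the request body described above can also be written in one step with a here-document. This is a sketch of one possible approach, not a method the original steps prescribe. The closing grep check is only a rough sanity test and does not validate the JSON.

```shell
# Write the request body in one step. Quoting 'EOF' stops the shell
# from expanding anything inside the JSON.
cat > request.json <<'EOF'
{
  "config": {
    "encoding": "FLAC",
    "languageCode": "en-US"
  },
  "audio": {
    "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
EOF

# Rough sanity check that both top-level objects made it into the file.
grep -q '"config"' request.json && grep -q '"audio"' request.json \
  && echo "request.json looks complete"
```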
Now you're ready to call the Speech API!
Call the Speech API
Pass your request body, along with the API key environment variable, to the Speech API with the following curl command (entered as a single command):
curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json \
  "https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}"
Your response should look something like this:
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
The transcript value contains the Speech API's text transcription of your audio file, and the confidence value indicates how sure the API is that it has accurately transcribed your audio.
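To show how these two fields might be pulled out on the command line, here is a small sketch using sed. It assumes the response was saved to a file named response.json (a hypothetical name; the curl command in this post prints to the terminal) and that the JSON is pretty-printed with one key per line, as in the sample. A JSON-aware tool such as jq would be more robust where it is available.

```shell
# Sample response copied from this post, saved to a hypothetical response.json.
cat > response.json <<'EOF'
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
EOF

# Take the first transcript: match the key and capture the quoted value.
transcript=$(sed -n 's/.*"transcript": *"\([^"]*\)".*/\1/p' response.json | head -n 1)

# Take the first confidence number the same way.
confidence=$(sed -n 's/.*"confidence": *\([0-9.]*\).*/\1/p' response.json | head -n 1)

echo "transcript: $transcript"
echo "confidence: $confidence"
```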
You'll notice that you called the recognize method in the request above. The Speech API supports both synchronous and asynchronous speech-to-text transcription. In this example you sent it a complete audio file and waited for the result, but the API can also perform streaming speech-to-text transcription while the user is still speaking.
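The synchronous call can also be wrapped so that the response body is saved to a file and the HTTP status code is returned for checking. The sketch below shows one way to do that. The function names speech_url and call_speech and the output file response.json are hypothetical and introduced only for illustration. The curl options used (-o to write the body to a file, -w with %{http_code} to print the status) are standard curl features.

```shell
# Hypothetical helper: build the recognize URL from an API key.
speech_url() {
  printf 'https://speech.googleapis.com/v1/speech:recognize?key=%s' "$1"
}

# Hypothetical helper: POST the given request file, save the response body
# to response.json, and print the HTTP status code on stdout.
call_speech() {
  curl -s -o response.json -w '%{http_code}' \
    -X POST -H "Content-Type: application/json" \
    --data-binary "@$1" "$(speech_url "$API_KEY")"
}
```

With API_KEY exported, running call_speech request.json should print 200 on success and leave the transcription in response.json. Any other status suggests inspecting that file for an error message.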
You created a Speech API request and then called the Speech API.