
Avoiding GCF Anti-Patterns Part 5: How To Run Background Processes Correctly In Python

Editor’s note: Over the past several weeks, we’ve posted a series of blog posts focusing on best practices for writing Google Cloud Functions based on common questions or misconceptions as seen by the Support team. We refer to these as “anti-patterns” and offer you ways to avoid them. This article is the fifth post in the series.

Scenario

You see finished with status: 'timeout' in the logs before a background process has completed in your Python Function. 

Most common root issue

Although this timeout error can happen for Functions using any runtime, we most often see this issue occur when Python developers try to use os.fork() or multiprocessing.Process() in their Cloud Function.

Why you should try to avoid async work in a Function:

A background task started by a Cloud Function is not guaranteed to complete. As soon as the Function completes (that is, the Function returns or a timeout error occurs), the Function instance can be terminated at any time. You can read more about the Function execution timeline in the documentation.
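
As a minimal sketch of what this means in practice, the function below finishes its work before it returns instead of handing it to a child process. The names slow_job and handler are hypothetical, and the sleep only simulates real processing; this assumes the work fits within the Function's configured timeout.

```python
import time


def slow_job(name):
    """Hypothetical stand-in for work you might be tempted to run in the background."""
    time.sleep(0.1)  # simulate some processing time
    return "finished {}".format(name)


def handler(request):
    """HTTP Cloud Function that completes its work before responding."""
    name = request.args.get("name", "job")

    # Anti-pattern (shown only as a comment): start the job and return at once.
    #   multiprocessing.Process(target=slow_job, args=(name,)).start()
    #   return "started"  # the instance may be terminated before slow_job ends

    # Instead, block until the work is done, then return the result.
    return slow_job(name)
```

If the work cannot reliably finish within the timeout, doing it inline is not enough, which is where a task queue becomes the better fit.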

We often see customers test their functions locally where these execution timeouts do not exist. Additionally, customers’ local machines may be more powerful than what they have provisioned for their Cloud Functions. Customers may see these multiprocessing scenarios working locally and therefore assume their code will work in the same way in the Cloud Function instance.  

For Python developers who require such async operations, we suggest using the Cloud Tasks service instead to schedule the background operation. See the example below.

Using Cloud Tasks in a Python Cloud Function

The following Function demonstrates how you can use Cloud Tasks to schedule an async operation. This example shows a Cloud Function (named “create_task”) that creates a Cloud Task, which in turn invokes a second Cloud Function that runs the background task. You can learn more about creating HTTP target tasks.

"""Create a task for a given queue with an arbitrary payload."""
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
import datetime
import json
 
 
def create_task(request):
 
   # Create a client.
   client = tasks_v2.CloudTasksClient()
 
   # TODO(developer): Replace these with your values.
   project = '<PROJECT_ID>'
   queue = 'my-queue'
   location = '<your-region>'
   url = '<url-to-Cloud-Function-to-run-background-task>'
   payload = {'message': 'Hello from Cloud Tasks'}
   in_seconds = 60
 
   # Construct the fully qualified queue name.
   parent = client.queue_path(project, location, queue)
 
   # Construct the request body.
   task = {
       "http_request": {  # Specify the type of request.
           "http_method": tasks_v2.HttpMethod.POST,
           "url": url,  # The full url path that the task will be sent to.
           "oidc_token": {
               "service_account_email": "<your-service-account-with-function-invoker-role>@<PROJECT_ID>.iam.gserviceaccount.com",
               "audience": url
           }
       }
   }
   if payload is not None:
       if isinstance(payload, dict):
           # Convert the dict to a JSON string.
           payload = json.dumps(payload)
           # Set the HTTP Content-Type header to application/json.
           task["http_request"]["headers"] = {
               "Content-type": "application/json"}
 
       # The API expects a payload of type bytes.
       converted_payload = payload.encode()
 
       # Add the payload to the request.
       task["http_request"]["body"] = converted_payload
 
   if in_seconds is not None:
       # Convert "seconds from now" into a UTC datetime.
       d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)
 
       # Create Timestamp protobuf.
       timestamp = timestamp_pb2.Timestamp()
       timestamp.FromDatetime(d)
 
       # Add the timestamp to the task.
       task["schedule_time"] = timestamp
 
   # Use the client to build and send the task.
   response = client.create_task(request={"parent": parent, "task": task})
 
   print("Created task {}".format(response.name))
   return 'Task Created!'
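
The listing above only creates the task; the Function at the target url is not shown. As a rough sketch, assuming it is an HTTP Function that receives the JSON payload sent by create_task, it might look like the following. The name run_background_task and the uppercase "work" are placeholders, not part of the original example.

```python
def run_background_task(request):
    """HTTP Cloud Function invoked by Cloud Tasks (hypothetical name)."""
    # create_task sends the payload as a JSON body with an
    # application/json content type, so parse it as JSON.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        payload = {}
    message = payload.get("message", "")

    # Placeholder for the real long-running work. It runs in the
    # foreground of this Function, within this Function's own timeout.
    result = message.upper()

    print("Processed task payload: {}".format(result))

    # A 2xx response tells Cloud Tasks the task succeeded.
    return (result, 200)
```

Because the work runs as the main body of its own Function invocation, it is no longer a background process that can be cut off when the calling Function returns. A non-2xx response or an unhandled exception would be treated by Cloud Tasks as a failed attempt and retried according to the queue's retry settings.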

Other helpful tips

  • Although this tutorial is written for Node.js, it walks you through creating a Cloud Tasks queue and setting up a service account that will invoke the Function from Cloud Tasks. By specifying a service account for the Task, you can use an authenticated Function.
  • If you’re using a different service account to invoke the Function (rather than your Function’s identity), you need to verify that the service account has the Cloud Functions Invoker role `roles/cloudfunctions.invoker`. 
  • If you’re using a different service account for your “create_task” Function’s identity than the default, you need to verify that the service account has permissions to create Tasks. It will need the Cloud Tasks Enqueuer role `roles/cloudtasks.enqueuer`.

You can also read more about Cloud Tasks in our third blog post in this series on making outbound connections.

By: Sara Ford (Cloud Developer Advocate) and Martin Skoviera (Technical Solutions Engineer)
Source: Google Cloud Blog
