{"guide": {"name": "creating-a-dashboard-from-bigquery-data", "category": "other-tutorials", "pretty_category": "Other Tutorials", "guide_index": null, "absolute_index": 60, "pretty_name": "Creating A Dashboard From Bigquery Data", "content": "# Creating a Real-Time Dashboard from BigQuery Data\n\n\n\n[Google BigQuery](https://cloud.google.com/bigquery) is a cloud-based service for processing very large data sets. It is a serverless and highly scalable data warehousing solution that enables users to analyze data [using SQL-like queries](https://www.oreilly.com/library/view/google-bigquery-the/9781492044451/ch01.html).\n\nIn this tutorial, we will show you how to query a BigQuery dataset in Python and display the data in a dashboard that updates in real time using `gradio`. The dashboard will look like this:\n\n\n\nWe'll cover the following steps in this Guide:\n\n1. Setting up your BigQuery credentials\n2. Using the BigQuery client\n3. Building the real-time dashboard (in just _7 lines of Python_)\n\nWe'll be working with the [New York Times' COVID dataset](https://www.nytimes.com/interactive/2021/us/covid-cases.html) that is available as a public dataset on BigQuery. The dataset, named `covid19_nyt.us_counties` contains the latest information about the number of confirmed cases and deaths from COVID across US counties.\n\n**Prerequisites**: This Guide uses [Gradio Blocks](/guides/quickstart/#blocks-more-flexibility-and-control), so make your are familiar with the Blocks class.\n\n## Setting up your BigQuery Credentials\n\nTo use Gradio with BigQuery, you will need to obtain your BigQuery credentials and use them with the [BigQuery Python client](https://pypi.org/project/google-cloud-bigquery/). If you already have BigQuery credentials (as a `.json` file), you can skip this section. If not, you can do this for free in just a couple of minutes.\n\n1. First, log in to your Google Cloud account and go to the Google Cloud Console (https://console.cloud.google.com/)\n\n2. In the Cloud Console, click on the hamburger menu in the top-left corner and select \"APIs & Services\" from the menu. If you do not have an existing project, you will need to create one.\n\n3. Then, click the \"+ Enabled APIs & services\" button, which allows you to enable specific services for your project. Search for \"BigQuery API\", click on it, and click the \"Enable\" button. If you see the \"Manage\" button, then the BigQuery is already enabled, and you're all set.\n\n4. In the APIs & Services menu, click on the \"Credentials\" tab and then click on the \"Create credentials\" button.\n\n5. In the \"Create credentials\" dialog, select \"Service account key\" as the type of credentials to create, and give it a name. Also grant the service account permissions by giving it a role such as \"BigQuery User\", which will allow you to run queries.\n\n6. After selecting the service account, select the \"JSON\" key type and then click on the \"Create\" button. This will download the JSON key file containing your credentials to your computer. It will look something like this:\n\n```json\n{\n\t\"type\": \"service_account\",\n\t\"project_id\": \"your project\",\n\t\"private_key_id\": \"your private key id\",\n\t\"private_key\": \"private key\",\n\t\"client_email\": \"email\",\n\t\"client_id\": \"client id\",\n\t\"auth_uri\": \"https://accounts.google.com/o/oauth2/auth\",\n\t\"token_uri\": \"https://accounts.google.com/o/oauth2/token\",\n\t\"auth_provider_x509_cert_url\": \"https://www.googleapis.com/oauth2/v1/certs\",\n\t\"client_x509_cert_url\": \"https://www.googleapis.com/robot/v1/metadata/x509/email_id\"\n}\n```\n\n## Using the BigQuery Client\n\nOnce you have the credentials, you will need to use the BigQuery Python client to authenticate using your credentials. To do this, you will need to install the BigQuery Python client by running the following command in the terminal:\n\n```bash\npip install google-cloud-bigquery[pandas]\n```\n\nYou'll notice that we've installed the pandas add-on, which will be helpful for processing the BigQuery dataset as a pandas dataframe. Once the client is installed, you can authenticate using your credentials by running the following code:\n\n```py\nfrom google.cloud import bigquery\n\nclient = bigquery.Client.from_service_account_json(\"path/to/key.json\")\n```\n\nWith your credentials authenticated, you can now use the BigQuery Python client to interact with your BigQuery datasets.\n\nHere is an example of a function which queries the `covid19_nyt.us_counties` dataset in BigQuery to show the top 20 counties with the most confirmed cases as of the current day:\n\n```py\nimport numpy as np\n\nQUERY = (\n 'SELECT * FROM `bigquery-public-data.covid19_nyt.us_counties` '\n 'ORDER BY date DESC,confirmed_cases DESC '\n 'LIMIT 20')\n\ndef run_query():\n query_job = client.query(QUERY)\n query_result = query_job.result()\n df = query_result.to_dataframe()\n # Select a subset of columns\n df = df[[\"confirmed_cases\", \"deaths\", \"county\", \"state_name\"]]\n # Convert numeric columns to standard numpy types\n df = df.astype({\"deaths\": np.int64, \"confirmed_cases\": np.int64})\n return df\n```\n\n## Building the Real-Time Dashboard\n\nOnce you have a function to query the data, you can use the `gr.DataFrame` component from the Gradio library to display the results in a tabular format. This is a useful way to inspect the data and make sure that it has been queried correctly.\n\nHere is an example of how to use the `gr.DataFrame` component to display the results. By passing in the `run_query` function to `gr.DataFrame`, we instruct Gradio to run the function as soon as the page loads and show the results. In addition, you also pass in the keyword `every` to tell the dashboard to refresh every hour (60\\*60 seconds).\n\n```py\nimport gradio as gr\n\nwith gr.Blocks() as demo:\n gr.DataFrame(run_query, every=gr.Timer(60*60))\n\ndemo.launch()\n```\n\nPerhaps you'd like to add a visualization to our dashboard. You can use the `gr.ScatterPlot()` component to visualize the data in a scatter plot. This allows you to see the relationship between different variables such as case count and case deaths in the dataset and can be useful for exploring the data and gaining insights. Again, we can do this in real-time\nby passing in the `every` parameter.\n\nHere is a complete example showing how to use the `gr.ScatterPlot` to visualize in addition to displaying data with the `gr.DataFrame`\n\n```py\nimport gradio as gr\n\nwith gr.Blocks() as demo:\n gr.Markdown(\"# \ud83d\udc89 Covid Dashboard (Updated Hourly)\")\n with gr.Row():\n gr.DataFrame(run_query, every=gr.Timer(60*60))\n gr.ScatterPlot(run_query, every=gr.Timer(60*60), x=\"confirmed_cases\",\n y=\"deaths\", tooltip=\"county\", width=500, height=500)\n\ndemo.queue().launch() # Run the demo with queuing enabled\n```\n", "tags": ["TABULAR", "DASHBOARD", "PLOTS"], "spaces": [], "url": "/guides/creating-a-dashboard-from-bigquery-data/", "contributor": null}}