Spreadsheet software, such as Microsoft Excel and Google Sheets, is a business necessity. It is a staple of planning, budgeting, and analysis, suitable for business areas where technological maturity is still developing. The latter is evident in emerging business areas or when exploring novel data acquisition methods. In these instances, spreadsheets are employed as a database and a pseudo data-capturing platform. Among a plethora of positives, using spreadsheets can be limiting. The number of rows allowed is often a concern for larger data sets, while calculation errors may unintentionally occur.
In this blog post, I explain why Google Sheets is a great tool to use with Python and SQL. Google Sheets is a user input form, while Python and SQL are used for data storing and processing. To illustrate the power of this combination, I allude to a recent project I completed for a client. Budgeting is a unique process for everyone, and the contents of a project like this will vary depending on the specific application. This post focuses on the technical aspects of the project, rather than the specific use case.
The scope—in short
The client had two distinct enterprise resource planning (ERP) systems running on their premises—one with a MySQL backend and the other with an MS SQL backend. They needed to create a budget using data from various systems, including data that had not yet been systematised. The systematised data included historical trends and dimension fields, such as product and client metadata. The unsystematised data, such as data unavailable in any system, supported the budget. This included client demand for the subsequent year, predicted product prices, and the cost structures used in exporting goods (fresh produce).
The budget had to be flexible enough to adapt to changes in market demand and supply availability. It was, therefore, clear that data needed to be centralised in a single ‘data mart’. Apart from the data sources, the user required near real-time feedback from the budgeting process, such as when a dimension field is missing a value, or a price field is missing; this should be emphasised to the user in the input fields.
Figure 1 presents the components contributing to a fresh produce budget. Each element is dynamic and constantly updated as the harvest or demand changes. A pre-season budget will experience significant changes during the harvesting period and will appear vastly different post-season. Your use case will undoubtedly differ, but I shall use the concepts from this project to describe the technology application of using Google Sheets with ERP system-generated data.

The workflow
The clients, being trusted Excel users, were familiar with and comfortable using a spreadsheet. Owing to a need for expediency, they preferred a spreadsheet over writing a front-end input platform. I have often used MS Excel on SharePoint; however, I found a snag in writing back into an Excel Online file while a user was busy on the sheet. Google Sheets overcame this through their well-documented and easy-to-use Application Programming Interface (API). Python could, therefore, be used to read from and write back into the same sheet without having to write over the file—a quick and effortless way to prototype the budgeting process.
Figure 2 demonstrates the data flows from source systems to the resulting reports. Historical trends, dimension fields, and realised (actual product sold) data were extracted from the on-premises ERP systems. Estimated pricing and costs, customer demand, and allocation preferences are managed in Google Sheets. All data were staged in a SQL database (I used MySQL as it was already established for another project) and transformed into a well-structured, normalised database. From here, calculations and algorithms could be run to provide feedback to the input fields through the Google Sheets API. I used Power BI to connect to the data model to provide a cash flow statement, currency reserve estimates, and provider payouts. The Power BI data model is automatically refreshed through the Power BI API whenever the fundamental data are updated in the data model.
The budget had to be flexible enough to adapt to changes in market demand and supply availability. It was, therefore, clear that data needed to be centralised in a single ‘data mart’. Apart from the data sources, the user required near real-time feedback from the budgeting process, such as when a dimension field is missing a value, or a price field is missing; this should be emphasised to the user in the input fields.
Figure 1 presents the components contributing to a fresh produce budget. Each element is dynamic and constantly updated as the harvest or demand changes. A pre-season budget will experience significant changes during the harvesting period and will appear vastly different post-season. Your use case will undoubtedly differ, but I shall use the concepts from this project to describe the technology application of using Google Sheets with ERP system-generated data.

Manage the workflow
When data are transferred from one location to another, the frequency of data updates is a top priority. For this project, not all data had to be ‘fresh’ but needed to be ‘fresh’ on demand. Data in Google Sheets are updated more frequently than data from on-premises sources, which may only change once a day or even weekly. Certain data, such as the previous day’s actualised data, can be refreshed on a schedule. To orchestrate the refresh of various parts of the process, I created a user configuration file (Figure 3). Here, the user can control which data sets are updated when ‘running’ the process.

Each row in the configuration file corresponds to a function in the Python module. Using crontab on a Linux Ubuntu server, I set a cron job to run every 10 seconds. The cron job verifies whether any function has been activated (Column B turned to True). The function, when activated, initiates the relevant processes that update or calculate the data model.
In Figure 4, the data updates in Google Sheets are triggered by the parent function called ‘synch_google_tables’. The ‘synch_google_tables’ function actuates the corresponding process for each data set.

The outcome
You do not need to be a full-stack developer, but some technical and programming skills are required to set up the process. The most challenging part was setting up the data model and deciding on the origin of each final piece of the data puzzle. Subject matter understanding is more important than technical ability.
Budgeting should be a dynamic process. Businesses require an updated observation of budget pitfalls or potential wins, especially in a fast-moving environment, such as fresh produce. Multiple opportunities exist for optimising income and cost structures—which can be exploited only with a more realistic budget forecast. A spreadsheet environment is conducive to “what-if” scenario planning, although more complicated calculations should be programmed into the Python or SQL code.
Tips and considerations
I conclude with a few thoughts should you embark on a similar journey:
- Be strict on database setup. I used Django—a Python object-relational mapper (ORM), to create tables, relationships, and table restrictions. A pro for using spreadsheets is that it is flexible. A con for using spreadsheets is that it is flexible. This flexibility can easily lead to duplication in data. To prevent this, stick to data keys and no-duplication index rules.
- Use a web service if possible. If not for the on-premises data, I would recommend that this type of project be hosted on a web service. Cron tabs are easy to write but a challenge to maintain. Error handling is crucial. I used “try and except” clauses in the code with Pythons’ built-in logging. Where a significant error occurs, it would send a notification to MS Teams through a Webhook. I also built in feedback, constructing a message back to the Config file to give the user an indication of what was wrong (e.g., duplicate index notifications, etc).
- Have a development environment. Staunch Spreadsheet users might initially be frustrated by maintaining two or three environments (development, quality assurance and production). A business might exert pressure to start using features still in development since this project is small, comprising only one developer. If not thoroughly evaluated, the flexibility of the spreadsheet environment can lead to data integrity issues. Splitting development and production environments allows for changes in a structured and tested way. I also strongly recommend locking down the spreadsheet as much as possible.
- Highlight missing dimension fields in user input sheets. To manage dimension fields in the input sheets, I ensured all fields had the latest dimension field values hidden in the workbook. On every update of the Sheets module, fresh dimension fields are written into all workbooks containing input forms. This emphasises when a dimension field has not yet been registered in the database and prompts the user to add it to the source system or a dimension table in a sheet.
In summary, businesses often need to discover a process before formalising it. Google Sheets can be used to prototype innovative ideas and function as a segue between business needs and software development. The process allows for flexibility within database rigidity. By using Google Sheets, Python, and SQL, analysts can leverage the strengths of each component instead of relying on spreadsheets as a data capturing, storage, and analytical layer system.
This was such a fun project! I particularly enjoyed experimenting with the powerful Google API and connecting with the vibrant Google API developer community.
Budgeting should be a dynamic process. Businesses require an updated observation of budget pitfalls or potential wins, especially in a fast-moving environment, such as fresh produce. Multiple opportunities exist for optimising income and cost structures—which can be exploited only with a more realistic budget forecast. A spreadsheet environment is conducive to “what-if” scenario planning, although more complicated calculations should be programmed into the Python or SQL code.