Good Spreadsheets

Rules for good spreadsheets

Why this book?

There are plenty of resources that teach you which formulas to use, how to use them, when to use “$” in your formulas etc but there very few resourses, almost none on how to organize your sheets, organize your data in a way that makes data more understandable and your sheets easier to use.

I am sure you have encounter many cases where superior, client or your personal curiosity let you a question that should be easily anwered by looking at your sheets. For example, lets imagine you work in a book shop and you want to find out what gene has been the top seller over the past year, well you have the sales data, you should be to count sales by genres and find out which one sold most and viola, you got your answer... only to find out that you have one sheet per month, so your analysis needs to be repeated 12 times (once per each month), and also the genere is sometimes included by some publishers but not others, some publishes include the genere but embeded withi other information about the book like its publing date or the number of pages.

Rule #01 “The order data is listed in your spreadsheet should not matter.” Users should be able to sort and filter data without losing information. What does this mean? Data should not have empty rows or breaks in the spreadsheet that organize data into subcategories. Columns and category fields should be added instead. If data seem incompatible, perhaps it is better to break the spreadsheet into different tabs or files.

Rule # 02 “Separate data and reports.” Data for analysis should be in one place. Summary tables, reports, graphs, etc. should be in another. Once data is consistent, reports can easily be generated via sumif functions and pivot tables.

Rule # 03 “The formatting of cells should not dictate information.” Cell fills and font colors should only help visualize data; these should not be the main data source. For example, in a spreadsheet of book orders, late orders should not be identified with only red font or a red cell fill. Instead, a new column for status should be added with the statement of “LATE” in it. The spreadsheet can be programmed to highlight or sort any rows with a “LATE” field to help with visualization.

Rule #04 “Cells should contain data, not text.” Users should be able to sort and filter any field to make sense of things. For example, if you have a cell with the following information “Received from Swanson, Inc. via email on 3/15/19”—this is wrong. This information should be contained in three different columns: “Date Received,” “Received From,” and “Received via” with “3/15/19,” “Swanson, Inc.,” and “Email” in each of these columns. If necessary, add a notes field to document additional information.

Rule #05 “Data should not be deleted.” If data is no longer valid, create a status category and add a “VOID” option, so this data is ignored in analyses.

Rule # 06 “Do not include data that is unnecessary.” In the example of Rule #04, is it necessary to document that an order was “Received from Swanson, Inc. via email on 3/15/19”? Perhaps the date the order was received and the name of the company that placed the order is crucial, but is it necessary to document that this order was received via email? It may or it may not. If it’s unnecessary information, leave this data out of your analysis tables.

Rule #07 “Blank cells should be avoided.” Blank cells should be reserved for cases where information is currently unknown. An “N/A” should be used in cases where a value is not applicable.

Rule #08 “Avoid empty sheets.” Excel’s default is a workbook with three sheets (sheet1, sheet2, sheet3). If a sheet is not used, remove it and label the sheet with a appropiate name.

Rule #09 “Always have an index value for each row in your data.” This index should be short, unique, and it should not change if you sort or add new data. The index value should not be a formula but a hard-typed value.

Rule #10 “Have intuitive and meaningful headings for columns and sheet names.” For example, the name of a column should not be “Date” but “Date Received” or “Date Received from Client.” Column headers should never be blank.

Rule #11 “Use a monospace font.” This is especially true if you are working with cost data. In monospace fonts, all letters have the same width. “Anonymous Pro” by Mark Simonson is a personal favorite.

Rule #12 “Keep cell formatting consistent.” For example, text fields indented left, numbers indented right. Use the applicable symbols to signify the data’s meaning; for example, cost data should have currency signs and percentages have the percentage signs, etc. The spreadsheet should have a style. Users should be hygienic; keep cell borders, fills, font, and font sizes consistent.

Rule #13 “Spell check should be done as often as possible.” The value of this task should not be underestemated.

Rule #14 “Numbers should have enough significant digits to make sense.” To say that an order has 34.000000 apples (to seven significant figures) is not useful; an interest rate of 0.00001237 it is. If you are working with whole numbers, use whole numbers.

Rule #15 “Pay attention to column widths.” Nobody likes a column heading or text that cannot be read because it is c ut of f.

Rule #16 “Spreadsheets should not have blank rows.” Any blank rows should be removed (see rule #1).

Rule #17 “Header and footers should be used and formatted as appropriate.” The header and footer should include sheet name, date, and page # out of the total number of pages. It is suggested the date is hard typed as to not auto-update if the spreadsheet is opened.

Rule #18 “Print preview your spreadsheets.” If/when printed, the spreadsheet should not be 20 pages long with two rows of data on each page. Pro-Tip: work in “Page Break Preview”.

Rule #19 “Learn pivot tables.” In Excel, pivot tables are a powerful tool for analysis. Understating pivot tables will save users a lot of time writing complex formulas to summarize data.

Rule #20 “Learn VBA.” With VBA in Excel, users will be able to make magic with spreadsheets.

Rule #21 “Write macros.” If a task is repetitive, write a macro—this will save time. Macros should be kept in a separate workbook from the main data.