Welcome!
Thank you for your interest in contributing to the DIBBs Query Connector project, part of the CDC's Office of Public Health Data, Surveillance, and Technology (OPHDST) Data Modernization Initiative. This is a multi-year effort to improve data quality and information technology systems in state and local health departments. There are many ways to contribute, including writing tutorials or blog posts, improving the documentation, submitting bug reports and feature requests, and writing code for the project itself.
Before contributing, we encourage you to also read our LICENSE, README, and code of conduct files, also found in this repository. If you have any questions that aren't answered in this repository, feel free to contact us.
Public domain
This project is in the public domain within the United States, and we waive copyright and related rights in the work worldwide through the CC0 1.0 Universal public domain dedication. All contributions to this project will be released under the CC0 dedication. By submitting a pull request or issue, you are agreeing to comply with this waiver of copyright interest and acknowledge that you have no expectation of payment, unless pursuant to an existing contract or agreement.
Bug reports
If you think you found a bug in DIBBs Query Connector, search our issues list in case someone has opened a similar issue. We'll close duplicate issues to help keep our backlog neat and tidy.
It's very helpful if you can prepare a reproduction of the bug. In other words, provide a small test case that we can run to confirm your bug. This makes it easier to find and fix the problem.
Feature requests
If you find yourself wishing for a new feature on Query Connector, you're probably not alone.
Open an issue on our issues list that describes the feature you'd like to see, why you want it, and how it should work. Please try to be as descriptive as possible. We can't guarantee that we'll be able to develop a requested feature, but we'll do our best to give it the care and consideration it deserves.
Contributing code and documentation changes
If you want to contribute a new feature or a bug fix, we ask that you follow a few guidelines. We have a dedicated team of engineers and support staff that works on this project on a daily basis, and following these steps ensures new pieces are interoperable, with no duplicated effort.
- Before you get started, look for an issue you want to work on. Please make sure this issue is not currently assigned to someone else. If there's no existing issue for your idea, please open one. We use the `good first issue` label to mark the problems we think are suitable for new contributors, so that's a good place to start.
- Once you select an issue, please discuss your idea for the fix or implementation in the comments for that issue. It's possible that someone else may have a plan for the issue that the team has discussed, or that there's context you should know before starting implementation. There are often several ways to fix a problem, and it's essential to find the right approach before spending time on a pull request (PR) that can't be merged.
- Once we've discussed your approach and everything looks good, we'll give you formal approval to begin development. Please don't start your development work before you get approval. We don't want you to waste your time on an issue if someone is already hard at work on it, or if there's something else in the way!
Fork and clone the repository
You need to fork the main code or documentation repository and clone it to your local machine. See the GitHub help pages for instructions.
Create a branch for your work.
If you need to base your work on someone else's branch, talk to the branch owner.
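As a rough sketch, the fork-and-branch workflow looks something like the following. The repository URL and branch name here are illustrative placeholders, not the project's actual values:

```bash
# Clone your fork to your local machine (replace <your-username>)
git clone https://github.com/<your-username>/query-connector.git
cd query-connector

# Track the upstream repository so you can pull in the latest changes
git remote add upstream https://github.com/<upstream-org>/query-connector.git

# Create a branch for your work
git checkout -b my-feature-branch
```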
Coding your changes
As you code, please make sure you add proper comments, documentation, and unit tests. DIBBs adheres to strict quality guidelines, regardless of whether a contributing engineer is on the team or external. Please also ensure that you thoroughly test your changes on your local machine before submitting.
Commenting guidelines
Commenting is an important part of writing code other people can easily understand (including your future self!). While each developer has a unique style when it comes to writing comments, we have a few guidelines that govern the project, outlined below.
Exported functions need to document their signatures (parameters and return values) with JSDoc comments. We have a linting step that checks this whenever you push changes to the repo, so make sure you've added those comments for any exported functions.
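For illustration, a JSDoc comment on an exported function might look like the following. The function itself is a made-up example, not part of the codebase:

```typescript
/**
 * Formats a patient's name for display in query results.
 * (Hypothetical example for illustration only.)
 * @param firstName - The patient's given name
 * @param lastName - The patient's family name
 * @returns The display-ready name, e.g. "Doe, Jane"
 */
export function formatPatientName(firstName: string, lastName: string): string {
  return `${lastName}, ${firstName}`;
}
```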
Line-level comments
Commenting brings the need to balance succinctness with verbosity, and clarity with maintainability. Sometimes striking this balance is more of an art than a science. In general, contributors should feel empowered to strike this balance as they see fit. However, in the interest of quality and consistency, we strive for a few principles when adding line-level comments:
- Self-documenting code is best: Use clear, descriptive, and precise names for modules, variables, and functions; write code that speaks for itself.
- Consider refactoring: Modularizing and other restructuring can lead to self-documenting code. Clear, cleanly structured code is better than adding comments to explain poorly written or confusing code.
- Comments should add clarity: Good comments can add information that can't be easily read or inferred, summarize functionality provided by a block of code, or add context to explain non-idiomatic code. Avoid restating information that's already in function-level comments, or can be easily obtained from within the code itself.
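To make these principles concrete, here's a hypothetical snippet (not from the codebase) where the comment adds context the code can't express on its own, rather than restating what the code does:

```typescript
// Hypothetical example: the inline comment explains "why", not "what".
export function chunkPatientIds(ids: string[], size = 50): string[][] {
  const chunks: string[][] = [];
  // Batch the IDs because some upstream servers reject requests
  // that reference too many patients at once.
  for (let i = 0; i < ids.length; i += size) {
    chunks.push(ids.slice(i, i + size));
  }
  return chunks;
}
```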
How to set up your local environment to contribute to our code
Hardware
Supported platforms for DIBBs tools include:
- Linux
- Windows 10/11
- Mac (Intel or Apple silicon)
Code coverage guidelines
We recognize there's a lot of human judgment when it comes to writing test cases. It's impossible to plan for every scenario in which a function might be used or think of every error that an input data type might cause. To ensure the repository is as robust as reasonably possible, we've developed the following principles around writing tests and code coverage.
- We strive for 85% repository coverage at all times. There's no magic number when it comes to assessing coverage quality. A high coverage percentage doesn't actually guarantee high-quality testing, and focusing too much on getting a coverage number as close to 100% as possible could lead to a false sense of security as well as wasted time and effort. At the same time, it's important we ensure that our tools are tested in a robust way. A coverage target of 85% strikes a balance between fixating on the coverage percentage and keeping tests robust and accountable. (A configuration sketch showing one way to enforce this appears after this list.)
- Any public-facing functions should include at least one test case. If a user can call a function, we want that function to have documented test coverage. Ideally, functions should have multiple tests (see below), but at a minimum, any public function should have at least one. For private or helper functions, discretion is given to the developer, but the principles below might suggest test cases for those, too.
- Broadly speaking, tests should address, at a minimum, three areas: the most-likely scenarios, the most complex scenarios, and the most painful failure scenarios. It's not practical to write tests for every permutation of inputs and results, but we can anticipate the types of use cases that are most likely to come up. Ideally, a function should have one or more tests that demonstrate success and/or failure in its most common use case(s). Additionally, functions that have complex interactions (whether with their parameters, internal logic, dependencies, etc.) should have tests addressing the robustness of that logic in the face of multiple settings or interactions; for example, a function that accepts 6 configurable parameters might have tests of that function with 4 or even 5 permutations of parameter settings. Finally, if a piece of code could generate a particularly painful result for a user if it were to break, that code should have tests in place around how those issues are handled.
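If the test runner is Jest (an assumption here, used only for illustration), one way to hold the line on an 85% floor is a coverage threshold in the Jest configuration:

```typescript
// jest.config.ts -- a sketch, assuming Jest is the project's test runner
import type { Config } from "jest";

const config: Config = {
  collectCoverage: true,
  // Fail the test run if overall coverage drops below 85%
  coverageThreshold: {
    global: {
      statements: 85,
      branches: 85,
      functions: 85,
      lines: 85,
    },
  },
};

export default config;
```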
These principles are our goals when it comes to testing and code coverage, and we always strive to uphold them. Below, we've listed some of the more practical, actionable kinds of tests that stem from them. We believe that specific tests should exist to cover, as much as possible, the following cases:
- Success and failure cases: Functions should have one or more tests showing that the function performs its intended job successfully, as well as one or more tests showing how the function handles a failure case or a case where processing isn't possible.
- All input and output data types: Functions should have one or more tests showing, for each possible input or output data type, how the function performs when it receives or creates that data type.
- Explicitly raised exceptions: If an exception is raised in the body code of a function, even if it's a conditional exception, one or more tests should exist showing that exception being raised.
- Edge cases (within reason): If a function might deal with edge cases in its processing (an empty list or dictionary parameter, for example), one or more tests should exist showing how the function behaves under those conditions.
We recognize that overlap exists between these categories; for example, a single test might cover a failure case, an edge case, and an exception case all at once by providing an empty dictionary as an input parameter. These points are intended as guidelines for the types of tests that should be written holistically to assess a function's robustness. How these principles are put into practical tests is up to the individual developer.
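As an illustration, here's what a minimal test file might look like, written for Jest (an assumption) against the hypothetical formatPatientName function from the commenting guidelines above:

```typescript
import { formatPatientName } from "./format"; // hypothetical module path

describe("formatPatientName", () => {
  // Most-likely scenario: the common success case
  it("formats a typical name", () => {
    expect(formatPatientName("Jane", "Doe")).toBe("Doe, Jane");
  });

  // Edge case: empty-string inputs still behave predictably
  it("handles empty strings without throwing", () => {
    expect(() => formatPatientName("", "")).not.toThrow();
  });
});
```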
Submitting your changes
Once your changes and tests are ready to submit for review:
- **Test your changes.** Run the full local test suite to make sure that nothing is broken.
- **Rebase your changes.** Update your local repository with the most recent code from the principal repository, and rebase your branch on top of the latest `main` branch. We prefer your initial changes to be squashed into a single commit. Later, if we ask you to make changes, add the changes as separate commits. This makes the changes easier to review.
- **Submit a PR.** Push your local changes to your forked copy of the repository and submit a PR. In the PR, please follow the template provided.
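As a sketch, the rebase-and-submit steps might look like the following, assuming the conventional `origin` (your fork) and `upstream` (principal repository) remote setup and a placeholder branch name:

```bash
# Pull the latest upstream changes and rebase your branch onto main
git fetch upstream
git rebase upstream/main

# Re-run the test suite after rebasing, then push to your fork;
# --force-with-lease is needed because the rebase rewrote history
git push --force-with-lease origin my-feature-branch
```

Using `--force-with-lease` rather than `--force` avoids clobbering work if your fork's branch changed unexpectedly.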
All PRs must have at least one approval. Please tag several engineers for review; you can see which engineers are most active by checking the repository's commit history.
Note that you shouldn't squash your commits yourself at the end of the review process; instead, the commits will be squashed when the pull request is merged via GitHub.
Reviewer responsibilities
A good code review considers many aspects of the PR:
- Will the code work, and will it work in the long term?
- Is the code understandable and readable? Is documentation needed?
- What are the security implications of the PR?
- Does the PR contain sensitive information like secret keys or passwords? Can these be logged?
Submitters should help reviewers by calling out how these considerations are met in comments on the review.
Credit
This document is a mashup of contributor documents from the CDC, Elasticsearch, SimpleReport, and ReportStream.