The United States will soon be able to analyze data from millions of individuals. Congress has mandated that the U.S. Food and Drug Administration develop a postmarket risk identification and analysis system that covers 100 million persons (2). In addition, the expansion of comparative effectiveness research envisioned by Congress requires access to health care information for large, diverse populations in real-world settings (3). Large, centralized data repositories could support these functions, but we and others (4, 5) believe that a distributed health data network has many practical advantages. First, a distributed network allows data holders to retain physical and logical control of their data. Second, it mitigates many security, proprietary, legal, and privacy concerns, including those regulated by the Privacy and Security Rules of the Health Insurance Portability and Accountability Act (6). Third, it eliminates the need to create, maintain, and secure access to central data repositories. Fourth, it minimizes the need to disclose protected health information outside the data-owning entity. Finally, a distributed network allows data holders to assess, track, and authorize requests for all data uses.