CBS Research Grid

The research that drives our world-class curriculum and showcases our exemplary faculty is of critical importance. From case studies to published works, researchers need access to the best tools and data that our institution can provide. The Information Technology Group offers a state-of-the-art research server, known as the CBS Research Grid, to facilitate academic research by faculty and doctoral students at Columbia Business School. The cluster is built on advanced Dell servers, EMC storage, Sun Grid Engine, and Linux. The Research Grid hosts a comprehensive set of current and historical economic and financial databases and a large collection of statistical and mathematical software programs. The following sections are intended to help you get started with the new server:

Economic and Business Databases

The Research Grid hosts a comprehensive set of current and historical financial and economic databases. COMPUSTAT and CRSP cover 30,000+ companies and include security prices and trading volume, income, and balance sheet items. I/B/E/S includes analysts’ projections for earnings, sales, and stock recommendations.  Databases also include stock market indices, bond prices, interest rates, mutual funds, stock ownership, stock options, real estate, and a wide array of macroeconomic time series. The following chart lists some of the commonly used databases:

DATA | DISTRIBUTOR | DATA SETS
Balance sheets, income statements, and other company-based financial items | COMPUSTAT | Annual and Quarterly Financial Reports, Business Segments, Price, Dividends, Earnings...
Stock prices and returns | CRSP | Monthly / Daily Stock Files, Indices files
Analysts Estimates | I/B/E/S | Daily Detailed History, Summary History
U.S. Treasury interest rates | CRSP | Bonds, Bills, and Inflation files
Banking industry conditions | FDIC | All files
U.S. Macro and International Economics | Global Insight (formerly DRI) | Annual, Quarterly and Monthly Economics
Options | Optionmetrics | Daily
Executive Compensation | COMPUSTAT/Execcomp | Annual

All databases on the Grid are stored and managed in SAS database format in a Linux environment.  The databases can also be accessed at the Wharton Research Data Services website (WRDS).   If you use the databases infrequently, you can retrieve data from each database on WRDS via a web interface.

The WRDS website also contains resources for each database, such as user manuals and variable definitions. Click here for an updated list of all databases that CBS subscribes to, together with their detailed usage information. You will be prompted to log in to WRDS. If you do not have a WRDS account, you can click on the Register tab near the top of the page to request an account.

Software Programs

The software programs on the CBS Research Grid include commonly used statistical (e.g., SAS, R, and STATA), mathematical (e.g., Matlab and Mathematica), and many other software applications. The Grid also provides Python and compilers for C, C++, Fortran, and Java. The following is a quick reference chart.

Name of Software Applications | Publisher | Function
CPLEX | CPLEX Optimization | A tool for solving linear optimization problems
GAUSS | Aptech Systems, Inc. | Mathematical and statistical analysis tools
MATHEMATICA | Wolfram Research | Software system for math and science
MATLAB | MathWorks, Inc. | Mathematical computation, analysis, visualization
R | Open Source | Software environment for statistical computing and graphics
Python | Open Source | Programming language useful for statistical computing, textual analysis, and web scraping
SAS | SAS Institute, Inc. | Database management and statistical analysis
STATA | StataCorp | Data management and statistical analysis
Stat/Transfer | Circle Systems | A tool for converting data between software applications
C, C++, FORTRAN, Java | | Compilers

For detailed usage information, please see CBS Research Wiki.

CBS Research Wiki

The CBS Research Wiki is an actively maintained website for the Research Grid. It contains introductions, training materials, usage notes, how-tos, and a long list of FAQs with answers. Click here to access the page: Research Wiki (requires VPN)

How to Get Help

The Information Technology Group is your primary source of IT support at Columbia Business School. Please contact us at [email protected] for issues related to the CBS Research Grid. We respond to emails within one business day.

Research Grid Sponsored Account

Periodically, researchers need to give their co-authors, collaborators, or external RAs access to the projects they are working on together. To request the creation or renewal of a guest user role for Research Grid access, the sponsoring faculty member may use the Sponsored Account Request form. Alternatively, divisional coordinators and departmental administrators may submit the form on the faculty member's behalf. This is a prerequisite to granting guests Research Grid access.

  • If the guest is not already a Columbia University affiliate (faculty/staff/student/alumni), the form will be submitted to CBS HR
    • HR will review and process the request
    • If approved, HR will create a UNI for Research Grid access purposes only
    • Once the UNI has been created, ITG can create the account
  • If the guest is a Columbia University affiliate, the request will be sent directly to ITG to create the account

Access can only be granted to guests for one year at a time, so a renewal form will need to be submitted annually.

Please note: This is not required if the request is solely for the purpose of a class. In that case, please contact your Program Office to have the student(s) formally registered and then contact [email protected].

Additional Storage Options

Researchers sometimes need more storage than the standard allotment (100 GB). For datasets and other non-critical working data, a 'scratch' area is available free of charge. Note: this storage is not backed up, and files not used in more than one month are automatically deleted! Contingent on available capacity, we can provide additional storage, subject to an annual charge.
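Because unused scratch files are removed automatically, it can help to check which of your files are at risk before they disappear. The following Python sketch is only an illustration, not an ITG tool: it assumes that "one month" means 30 days and that "used" means the later of a file's last access and last modification time. The actual cleanup rule on the Grid may differ, and the function name is hypothetical.

```python
import os
import time
from pathlib import Path

STALE_AFTER_DAYS = 30  # assumption: "one month" treated as 30 days


def find_stale_files(root, max_age_days=STALE_AFTER_DAYS, now=None):
    """Return files under root that have not been used for max_age_days.

    "Used" is assumed to mean the more recent of the access time and the
    modification time; the Grid's real cleanup policy may use another rule.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            last_used = max(info.st_atime, info.st_mtime)
            if last_used < cutoff:
                stale.append(path)
    return sorted(stale)
```

Calling find_stale_files on your own scratch directory returns a sorted list of paths; anything you still need should be copied to backed-up storage.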

For extra Enterprise Storage, researchers will be charged $100 per TB per year, in 500 GB increments. To request additional storage, please contact us at [email protected].
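As a rough budgeting aid, the pricing above can be turned into a short calculation. This Python sketch rests on two assumptions that the text does not state: a request is rounded up to the next 500 GB increment, and 1 TB is counted as 1000 GB (so each increment costs $50 per year). Please confirm the actual charge with ITG; the function name is hypothetical.

```python
import math

RATE_PER_TB_PER_YEAR = 100  # dollars, from the pricing stated above
INCREMENT_GB = 500          # billing increment, from the pricing stated above
GB_PER_TB = 1000            # assumption: decimal terabytes


def annual_storage_cost(requested_gb):
    """Return (billed_gb, yearly_cost_in_dollars) for extra storage.

    Assumes partial increments are rounded up to the next 500 GB.
    """
    if requested_gb <= 0:
        return 0, 0.0
    increments = math.ceil(requested_gb / INCREMENT_GB)
    billed_gb = increments * INCREMENT_GB
    cost = billed_gb / GB_PER_TB * RATE_PER_TB_PER_YEAR
    return billed_gb, cost


print(annual_storage_cost(1200))  # (1500, 150.0)
```

Under these assumptions, a request for 1200 GB would be billed as 1500 GB, or $150 per year.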