2421    Overview of the Sequence Rules [R-08.2012]

2421.01    Definition of "Sequence Listing" and "CRF" [R-07.2015]

The sequence rules (37 CFR 1.821 - 1.825) require the use of standard symbols and a standard format for submitting sequence data in most patent applications that disclose nucleic acid or amino acid sequences. For purposes of the sequence rules and the discussion in MPEP Chapter 2400, the phrase "disclose(d) (or disclosure(s) of) nucleic acid or amino acid sequences" is intended to refer to those nucleic acid or amino acid sequences that are described in the patent application by enumeration of their residues and that meet the length thresholds of 37 CFR 1.821(a).

37 CFR 1.821(c) requires that applications containing disclosures of nucleotide and/or amino acid sequences that fall within the definitions of 37 CFR 1.821(a) contain, as a separate part, a disclosure of the nucleotide and/or amino acid sequences, and associated information, using the format and symbols that are set forth in 37 CFR 1.822 and 37 CFR 1.823. This separate part of the disclosure is referred to as the "Sequence Listing" (hereinafter alternatively referred to as "sequence listing"). The sequence listing required pursuant to 37 CFR 1.821(c) is the official copy of the sequence listing, and may be submitted as an ASCII text file via EFS-Web, on compact disc, as a PDF submitted via EFS-Web, or on paper. See MPEP § 2422.03 for additional information.

37 CFR 1.821(e) requires that a copy of the sequence listing referred to in 37 CFR 1.821(c) must also be submitted in computer readable form (CRF) as an ASCII text file in accordance with the requirements of 37 CFR 1.824 (hereinafter "CRF of the sequence listing" or "CRF"). The computer readable form may be submitted on the electronic media permitted by 37 CFR 1.824, or may be submitted as an ASCII text file via EFS-Web. See MPEP § 2422.04 for additional information.

2421.02    Summary of the Requirements of the Sequence Rules [R-07.2015]

The sequence rules define a set of symbols and procedures that are mandatory and are the only way an applicant is permitted to describe, in the sequence listing, information about a sequence that falls within the definitions used in the rules. Thus, 37 CFR 1.821 defines a "sequence" and a sequence listing for the purpose of the rules, and sets forth the requirements for specific symbols and formats for the sequence listing, the requirement for a computer readable form (CRF) of the sequence listing, and the deadlines for complying with the requirements. 37 CFR 1.822 to 37 CFR 1.824 set forth detailed descriptions of the requirements that are mandatory for the presentation of sequence data, and 37 CFR 1.825 sets forth procedures that are available to an applicant in the event that amendments to the sequence listing or replacement of the CRF of the sequence listing become necessary.

The sequence rules embrace all unbranched nucleotide sequences with ten or more nucleotide bases and all unbranched, non-D amino acid sequences with four or more amino acids, provided that there are at least 10 "specifically defined" nucleotides or 4 "specifically defined" amino acids. The rules apply to all sequences in a given application, whether claimed or not. All such sequences are relevant for the purposes of building a comprehensive database and properly assessing prior art. It is therefore essential that all sequences, whether only disclosed or also claimed, be included in the database.
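The length thresholds described above can be illustrated with a minimal sketch. It is not an Office tool and is offered only as an aid to reading the thresholds. The function name, the list-of-symbols input, and the treatment of "n" and "Xaa" as the symbols that are not "specifically defined" are assumptions made for this example; 37 CFR 1.821(a) remains the controlling definition.

```python
# Illustrative sketch only; names and input format are hypothetical.
# Minimum counts are taken from the text above.
THRESHOLDS = {"nucleotide": 10, "amino_acid": 4}

# Assumption: "n" (nucleotide) and "Xaa" (amino acid) are the symbols
# that do NOT count as "specifically defined".
UNKNOWN = {"nucleotide": "n", "amino_acid": "xaa"}


def falls_within_rules(residues, kind, branched=False, has_d_amino_acids=False):
    """Return True if an enumerated sequence meets the thresholds.

    residues: list of residue symbols, e.g. list("atgc...") or ["Met", "Gly", ...]
    kind: "nucleotide" or "amino_acid"
    """
    if branched:
        return False  # only unbranched sequences are embraced
    if kind == "amino_acid" and has_d_amino_acids:
        return False  # D-amino acid sequences are excluded
    minimum = THRESHOLDS[kind]
    defined = sum(1 for r in residues if r.lower() != UNKNOWN[kind])
    # Both the overall length and the count of specifically defined
    # residues must reach the minimum.
    return len(residues) >= minimum and defined >= minimum


if __name__ == "__main__":
    print(falls_within_rules(list("atgcatgcat"), "nucleotide"))
    print(falls_within_rules(["Met", "Xaa", "Ala", "Ser"], "amino_acid"))
```

Under these assumptions, a ten-base sequence with no "n" symbols qualifies, while a four-residue peptide containing one "Xaa" does not, because only three of its residues are specifically defined.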

2421.03    Notification of a Failure to Comply [R-07.2015]

With respect to the Office’s determination of compliance with the sequence rules and the opportunities afforded applicants to satisfy the requirements of the rules, applicants will be notified of easily detectable deficiencies early in the application process. Applicants whose computer readable forms are not readable, or are missing mandatory elements, will be notified shortly after receipt of the application by the Office. Applications filed on or after November 29, 2000, will be retained in the Office of Patent Application Processing (OPAP) until any noncompliant sequence listing that renders an application unsuitable for examination is corrected.

OPAP will mail a Notice to Comply With Requirements For Patent Applications Containing Nucleotide Sequence And/Or Amino Acid Sequence Disclosures to applicant, listing the requirements that have not been met and setting a two-month time period within which to comply with the sequence rules, 37 CFR 1.821 - 1.825. Failure to comply with these requirements will result in abandonment of the application under 37 CFR 1.821(g). Extensions of time may be obtained by filing a petition accompanied by the extension fee under the provisions of 37 CFR 1.136.

Patent applications filed under 35 U.S.C. 111 on or after December 18, 2013, and international patent applications in which the national stage commenced under 35 U.S.C. 371 on or after December 18, 2013, may be subject to reductions in patent term adjustment pursuant to 37 CFR 1.704(c)(13) if they are not in condition for examination within eight months from the filing date or date of commencement, respectively. "In condition for examination" includes compliance with 37 CFR 1.821 - 1.825 (see 37 CFR 1.704(f)). Deficiencies of a more sophisticated nature will likely only be detected by the examiner to whom the application is assigned. Applicant will be notified of any errors or inconsistencies detected by the examiner in the next Office action.
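The eight-month timing condition described above can be sketched as simple date arithmetic. This is a rough illustration only: the function names are hypothetical, the month-end clamping is an assumption of this example, and the sketch does not compute the amount of any reduction or account for weekend, holiday, or other timing provisions. 37 CFR 1.704 remains the controlling text.

```python
# Illustrative sketch only; function names are hypothetical.
import calendar
from datetime import date

# Date stated in the text above.
RULE_EFFECTIVE = date(2013, 12, 18)


def add_months(start, months):
    """Add calendar months, clamping to the last day of the target month
    (the clamping is an assumption of this sketch)."""
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def pta_reduction_possible(filing_or_commencement, in_condition_date):
    """Return True if the application was not in condition for examination
    within eight months of its filing or commencement date."""
    if filing_or_commencement < RULE_EFFECTIVE:
        return False  # outside the group of applications described above
    return in_condition_date > add_months(filing_or_commencement, 8)


if __name__ == "__main__":
    filed = date(2014, 1, 15)
    print(add_months(filed, 8))
    print(pta_reduction_possible(filed, date(2014, 9, 16)))
```

In this sketch, an application filed January 15, 2014 that first complies with 37 CFR 1.821 - 1.825 on September 16, 2014 falls outside the eight-month window, so a reduction may apply.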

A notification of a failure to comply with the sequence rules will be accompanied by an analysis of any submitted computer readable form. Any inquiries regarding a specific computer readable form that has been processed by the Office should be directed to the Sequence Systems Service Center of the Scientific and Technical Information Center at 571-272-2510 or via email at STIC-SSSCHelpdesk@uspto.gov.

2421.04    Future Changes to the Sequence Rules [R-07.2015]

With general regard to the symbols and format to be used for nucleotide and/or amino acid sequence data set forth in 37 CFR 1.822 and the form and format for sequence submissions in computer readable form set forth in 37 CFR 1.824, the Office intends to accommodate progress in the areas of both standardization and computerization as they relate to sequence data by subsequently amending the rules to take into account any such progress. As the Office progresses in these areas, the Office will do so by the publication of notices in the Official Gazette or formal rulemaking proposals, as appropriate. The Office will also continue work on the preparation of a new World Intellectual Property Organization (WIPO) standard on the presentation of nucleotide and amino acid sequence listings using eXtensible Markup Language (XML) with the members of the Task Force on Sequence Listings created by the Committee on WIPO Standards. See Request for Comments on the Recommendation for the Disclosure of Sequence Listings Using XML (Proposed ST.26), 77 Fed. Reg. 28541 (May 15, 2012), for additional information.