
The neuronal genome of Caenorhabditis elegans*

Oliver Hobert§
Columbia University Medical Center, HHMI, New York, NY, USA;

Table of Contents

1. Introduction
2. Ion channels
2.1. Potassium channels
2.2. Calcium channels, transporters and calcium binding proteins
2.3. TRP channels
2.4. Cyclic nucleotide-gated ion channels
2.5. Ligand-gated ion channels
2.6. Ionotropic glutamate receptors
2.7. DEG/ENaC/ASIC channels
2.8. Chloride channels and chloride transporters
2.9. New ion channels
2.10. Summary of absent ion channels
3. Neurotransmitter pathways
3.1. Neurotransmitter synthesis
3.2. Vesicular transport of neurotransmitters
3.3. Neurotransmitter reuptake
3.4. Neurotransmitter degradation
3.5. The case for and against other neurotransmitter systems
4. Neuropeptides
4.1. Neuropeptide-encoding genes
4.2. Biosynthesis and processing of neuropeptides
4.3. Neuropeptide receptors
5. G-protein coupled receptors (GPCRs)
5.1. Metabotropic neurotransmitter receptors
5.2. Neuropeptide receptors
5.3. Sensory and orphan receptors
5.4. Adhesion GPCRs
5.5. Frizzled/Taste2 GPCRs
5.6. Downstream of GPCRs
6. Cyclic GMP
6.1. Guanylyl cyclases
6.2. Phosphodiesterases
7. Receptors for CO2 and O2
8. Presynaptic machinery
9. Neurotransmitter receptor localization: PDZ proteins
10. Gap junctions - the innexins
11. Motor proteins & their associated complexes
11.1. Kinesin, dynein and myosin motors
11.2. Motor complexes that build cilia of sensory neurons
12. Neuronal recognition and adhesion molecules
12.1. Immunoglobulin superfamily
12.2. Leucine-Rich Repeat (LRR) proteins
12.3. Cadherins
12.4. Neurexin and its ligands
13. Conclusions
14. Tables 2-35
Table 2: Potassium channels (72 genes)
Table 3: Candidate auxiliary subunits for various types of ion channels (93 genes)
Table 4: Voltage-gated calcium channels (9 genes)
Table 5: SLC transporters with confirmed or putative neuronal functions (82 genes)
Table 6: Calcium binding proteins – the “EF hand-only” proteins (65 genes)
Table 7: TRP channels (23 genes)
Table 8: Cyclic nucleotide gated channels (6 genes)
Table 9: nAChR-type ligand-gated ion channels of the Cys-loop LGIC superfamily (61 genes)
Table 10: Other ligand-gated ion channels of the Cys-loop LGIC superfamily (41 genes)
Table 11: Ionotropic glutamate receptors (15 genes)
Table 12: DEG/ENaC channels (32 genes)
Table 13: Chloride channels (35 genes)
Table 14: Neurotransmitter synthesis and degradation (36 genes)
Table 15: Neuropeptide-encoding genes (122 genes)
Table 16: Metabolism of neuropeptides (47 genes)
Table 17: Insulin/EGF receptor-like proteins (70 genes)
Table 18: The five classes of GPCRs
Table 19: Metabotropic neurotransmitter receptors (27 genes)
Table 20: GPCR-type putative neuropeptide receptors and their grouping into families (153 genes)
Table 21: Representative analysis of srw genes reveals their relation to neuropeptide receptors
Table 22: Adhesion-type GPCRs (5 genes)
Table 23: Downstream of GPCRs (83 genes)
Table 24: Making and breaking cGMP - Guanylyl cyclases and phosphodiesterases (40 genes)
Table 25: Receptors of CO2 and O2 (39 genes plus 7 soluble GCY genes)
Table 26: Synaptic vesicle proteins and their homologs (57 genes)
Table 27: PDZ domain proteins (70 genes)
Table 28: Gap junction proteins – the innexins (25 genes)
Table 29: Kinesin-like Motor Proteins (21 genes)
Table 30: Dynein motors (17 genes)
Table 31: Myosin Motors (18 genes)
Table 32: Sensory cilia transport (35 genes)
Table 33: Extracellular Immunoglobulin (Ig) and Leucine rich repeat (LRR) domain-containing proteins (93 genes)
Table 34: Cadherins (13 genes)
Table 35: Neurexin superfamily and neurexin ligands (8 genes)
15. Acknowledgements


The ~100 Mb genome of C. elegans contains ~20,000 protein-coding genes, many of which are required for the function of the nervous system, which is composed of 302 neurons in the adult hermaphrodite and 383 neurons in the adult male. In addition to housekeeping genes, a differentiated neuron is thought to express many hundreds if not thousands of genes that define its functional properties. These genes code for ion channels, G-protein-coupled receptors, neurotransmitter-synthesizing enzymes, transporters and receptors, neuropeptides and their receptors, cell adhesion molecules, motor proteins, signaling molecules and many others. Collectively, such genes have been referred to as “terminal differentiation genes” or “effector genes”. The differential expression of distinct combinations of terminal differentiation genes defines different neuron types. This paper provides a compendium of more than 2,800 putative terminal differentiation genes. One pervasive theme revealed by the analysis is the nematode-specific expansion of many neuron function-related gene families, including, for example, many ion channel families, sensory receptors and neurotransmitter receptors. The gene lists provided here can serve multiple purposes. They can serve as quick reference guides for individual gene families or they can be used to mine large datasets (e.g., expression datasets) for genes with likely functions in the nervous system. They also serve as a starting point for future projects. For example, a comprehensive understanding of the regulation of the often complex expression patterns of these genes in the nervous system will eventually explain how nervous systems are built.

1. Introduction

Neurons are information processing devices that receive, integrate and transmit signals to induce specific patterns of behavior. Among the key defining features of a mature neuron are its specific position, morphology and physical connections (in the form of electrical and chemical synapses), its electrophysiological properties (i.e., resting potential of the cellular membrane), and the molecular means by which it receives, propagates and transmits chemical signals, either locally across synapses or over longer distances in a paracrine manner. These basic features are defined by the expression of “nuts and bolts” genes that have demonstrated or predicted functions in terminally differentiating or mature neurons (Table 1). Such genes have been referred to as “terminal differentiation genes” or “effector genes” (see Neurogenesis in the nematode Caenorhabditis elegans) and are the focus of this review. These gene families are listed in the overview Table 1 and include ~2,800 genes. Differences in the identity and function of individual neuron types can presumably be ascribed to the differential expression of specific members of these gene families.

Table 1: Summary of genes discussed in each chapter. As mentioned in the text, molecules listed in specific categories in this Table are often no more than mere candidates for being involved in the indicated function.

Section Gene family Number of genes Table
1. Introduction    
2. Ion channels    
2.1 Potassium channels    
2.1.1 Channel types 72 Table 2
2.1.2 Auxiliary subunits 53 Table 3
2.2 Calcium channels, transporters and binding proteins    
2.2.1 Voltage gated calcium channels and auxiliary subunits 11 Table 4, Table 3
2.2.2 Other calcium channels 3  
2.2.3 Calcium transporter 14 Table 5
2.2.4 Calcium binding proteins 65 Table 6
2.3 TRP channels 23 Table 7
2.4 Cyclic nucleotide-gated ion channels 6 Table 8
2.5 Ligand-gated ion channels (LGICs)    
2.5.1 nAChR-type ligand-gated ion channels of the Cys-loop LGIC superfamily 61 Table 9
2.5.2 Other ligand-gated ion channels of the Cys-loop LGIC superfamily (GABA-, Glutamate-gated and others) 41 Table 10
2.5.3 Auxiliary subunits of the Cys-loop LGIC superfamily 20 Table 3
2.6 Ionotropic glutamate receptors    
2.6.1 Channel types 15 Table 11
2.6.2 Auxiliary subunits 8 Table 3
2.7 DEG/ENaC/ASIC channels    
2.7.1 Channel types 32 Table 12
2.7.2 Auxiliary subunits 10 Table 3
2.8 Chloride channels and chloride transporters    
2.8.1 Chloride channels 35 Table 13
2.8.2 Chloride transporters 11 Table 5
2.9 New ion channels 1  
2.10 Summary of absent ion channels    
3. Neurotransmitter pathways    
3.1 Neurotransmitter synthesis 24 Table 14
3.2 Vesicular transport of neurotransmitters 17 (+12) Table 5
3.3 Neurotransmitter reuptake 32 Table 5
3.4 Neurotransmitter degradation 12 Table 14
3.5 The case for and against other neurotransmitter systems    
4. Neuropeptides    
4.1 Neuropeptide-encoding genes 122 Table 15
4.2 Biosynthesis and processing of neuropeptides 47 Table 16
4.3 Neuropeptide receptors: beyond the GPCRs 70 (+ GPCR) Table 17
5. G-protein coupled receptors (GPCRs)   Table 18
5.1 Metabotropic neurotransmitter receptors 29 Table 19
5.2 GPCR-type neuropeptide receptors (+ additional candidates) 153 (+100) Table 20, (+Table 21)
5.3 Sensory and orphan GPCRs ∼1,280  
5.4 Adhesion GPCRs 5 Table 22
5.5 Frizzled/Taste2 GPCRs 4 Table 18
5.6 Downstream of GPCRs 83 Table 23
6. cGMP    
6.1 Guanylyl cyclases 34 Table 24
6.2 Phosphodiesterase 6 Table 24
7. Receptors of CO2 and O2 39 Table 25
8. Presynaptic machinery 57 Table 26
9. Neurotransmitter receptor localization: PDZ proteins 70 Table 27
10. Gap junctions 25 Table 28
11. Motor proteins & their associated complexes    
11.1 Kinesin, dynein and myosin motors 56 Tables 29-31
11.2 Motor complexes that build cilia of sensory neurons 35 Table 32
12. Neuronal recognition and adhesion molecules    
12.1 Immunoglobulin superfamily 64 Table 33
12.2 eLRR proteins 29 Table 33
12.3 Cadherins 13 Table 34
12.4 Neurexins superfamily and neurexin ligands 8 Table 35
  Total 2,890*  

*Not the exact sum of individual numbers because some genes occur multiple times in different categories (auxiliary ion channel subunits—4 duplicates; ciliary components—7 duplicates; Ig/LRR—6 duplicates)
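The arithmetic behind the footnote can be checked with a short sketch. The counts below are transcribed from Table 1. Including the parenthetical additions (+12 and +100) in the sum is an assumption made here, adopted because it reproduces the stated total once the 17 duplicates are subtracted. The names `SECTION_COUNTS`, `DUPLICATES` and `table1_total` are illustrative only.

```python
# Gene counts per Table 1 row, grouped by top-level section.
# Assumption: the parenthetical "(+12)" and "(+100)" entries are counted.
SECTION_COUNTS = {
    "2.1 Potassium channels": [72, 53],
    "2.2 Calcium channels, transporters, binding proteins": [11, 3, 14, 65],
    "2.3 TRP channels": [23],
    "2.4 CNG channels": [6],
    "2.5 Ligand-gated ion channels": [61, 41, 20],
    "2.6 Ionotropic glutamate receptors": [15, 8],
    "2.7 DEG/ENaC/ASIC channels": [32, 10],
    "2.8 Chloride channels and transporters": [35, 11],
    "2.9 New ion channels": [1],
    "3. Neurotransmitter pathways": [24, 17, 12, 32, 12],  # second 12 is "(+12)"
    "4. Neuropeptides": [122, 47, 70],
    "5. GPCRs": [29, 153, 100, 1280, 5, 4, 83],  # 100 is "(+100)"
    "6. cGMP": [34, 6],
    "7. Receptors of CO2 and O2": [39],
    "8. Presynaptic machinery": [57],
    "9. PDZ proteins": [70],
    "10. Gap junctions": [25],
    "11. Motor proteins": [56, 35],
    "12. Recognition and adhesion molecules": [64, 29, 13, 8],
}

# Genes listed in more than one category, as given in the Table 1 footnote.
DUPLICATES = {
    "auxiliary ion channel subunits": 4,
    "ciliary components": 7,
    "Ig/LRR": 6,
}


def table1_total(counts=SECTION_COUNTS, duplicates=DUPLICATES):
    """Sum all row counts, then subtract genes that were counted twice."""
    raw_sum = sum(sum(rows) for rows in counts.values())
    return raw_sum - sum(duplicates.values())


print(table1_total())  # 2907 - 17 = 2890
```

Under that assumption, the raw sum of 2,907 minus the 17 duplicated genes gives the 2,890 reported in the Total row.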

Structural and regulatory genes involved in cytoskeletal organization (e.g., small GTPases) or in basic cellular processes are not considered here since most of them have broad functions in many different cell types and are also sometimes only transiently expressed in the nervous system. Gene regulatory factors are also not considered because a neuronal function is difficult to predict a priori (the only exception being proneural bHLH factors; however, with a few possible exceptions, these factors usually have no function in mature neurons). The reader is referred to Neurogenesis in the nematode Caenorhabditis elegans, which describes gene regulatory factors operating during nervous system development.

The gene lists provided in this review are an update and extension of the first analysis of neurobiology-related gene families in the C. elegans genome, compiled by Cori Bargmann in the 1998 C. elegans genome issue of Science (Bargmann, 1998). The gene lists also summarize and extend many ensuing sequence analyses of individual gene families, as referenced in the respective sections below. The completeness of the analysis of individual gene families was assessed by a combination of domain searches using SMART, InterPro and Panther databases (Schultz et al., 2000; Zdobnov and Apweiler, 2001; Thomas et al., 2003; McDowall and Hunter, 2011), by analysis of gene families as shown in TreeFam (Li et al., 2006) and, if necessary, by re-iterative BLASTP searches. It cannot be excluded that more sophisticated sequence analyses may reveal additional family members. A substantial number of new gene names were assigned, many of them completely new names, and many in accordance with previously assigned names. For some gene families the numbers provided here differ from those of previous reports and database collections, e.g., InterPro domain databases (used in the description of protein families in Genomic classification of protein-coding gene families). This is because databases are populated by a large number of duplicate entries that either reflect differentially spliced isoforms arising from the same locus or trivial problems in duplicate gene naming. In contrast to these databases, the counts presented in this review rely almost entirely on manual curation of gene families, with the exception of the chemosensory subfamily of 7TMR genes with ~1,280 members, for which I relied, in large part, on the analysis by Robertson and Thomas described in The putative chemoreceptor families of C. elegans. In many cases, counts presented here also differ from previous analyses because the genome sequence was almost but not entirely complete at the time of previous analyses. In addition, gene models have sometimes changed significantly over the years as a result of improved gene prediction and experimental validation through in-depth transcriptome analysis (Gerstein et al., 2010).

The gene lists also include a rough and superficial description of known expression patterns. As mentioned in the individual chapters below, the expression of many genes has been analyzed and neuronal expression has been confirmed (references to expression patterns and individual gene functions are most often not provided directly in the text, but the respective gene names are hyperlinked to Wormbase entries in which function and expression patterns are described in more detail and where references are provided). However, for a substantial number of genes the expression is either unknown or could not be detected in the nervous system using (perhaps incomplete) reporter gene fusion constructs; their inclusion in this compendium is solely based on the potential of the gene to determine specific neuronal properties and should not be considered a documented fact. Genes with important functions in a neuron can also have similar (or distinct) functions in a non-neuronal cell type. More information on individual genes can be found in the hyperlinked Wormbase entries for individual genes, which also provide appropriate references to the literature.
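One intended use of the gene lists, mining an expression dataset for genes with likely neuronal functions, can be sketched as follows. This is a minimal illustration, not a recommended pipeline: the gene names are taken from this chapter, but the expression table, its `fold_change` column, the placeholder entries `hypothetical-1` and `hypothetical-2`, and the cutoff are all invented for the example.

```python
import csv
import io

# A small subset of gene names mentioned in this chapter.
NEURONAL_GENES = {"egl-19", "unc-2", "cca-1", "tax-2", "tax-4", "slo-1"}

# Hypothetical expression table: made-up values for illustration only.
EXPRESSION_CSV = """gene,fold_change
egl-19,2.4
tax-4,5.1
hypothetical-1,3.3
slo-1,0.9
hypothetical-2,7.8
"""


def mine_expression(csv_text, gene_list, min_fold_change=2.0):
    """Return (gene, fold_change) pairs that are in gene_list and pass the cutoff."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = float(row["fold_change"])
        if row["gene"] in gene_list and value >= min_fold_change:
            hits.append((row["gene"], value))
    # Strongest changes first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


print(mine_expression(EXPRESSION_CSV, NEURONAL_GENES))
# [('tax-4', 5.1), ('egl-19', 2.4)]
```

As the preceding paragraph cautions, a hit from such a filter is only a candidate: membership in a gene list reflects the potential of a gene to determine neuronal properties, not documented neuronal expression.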

2. Ion channels

Among the key defining features of a neuron are the enormously varied ways to regulate the electrical properties across the cellular membrane, a feat achieved through a variety of different ion channels. Most plasma membrane ion channels in the nervous system come in four distinct topologies which likely evolved independently (Hille, 2001; Jegla et al., 2009) (Figure 1):


Figure 1: Topology of the main families of ion channels. Only the pore-forming subunits are shown. Note that the sodium and calcium channels are made from a repeating unit of the potassium channels. All of these ion channel families are found in worms; however, voltage-gated sodium channels and P2X channels are not encoded in the C. elegans genome. The numbers next to the brackets indicate the number of subunits in multimeric channel structures. These are highly schematized drawings that reflect transmembrane topologies; variable sizes in the loop domains are not illustrated.

  1. The voltage-gated family of potassium, sodium and calcium channels. The pore-forming α subunit of both voltage-gated calcium and sodium channels contains 24 transmembrane (TM) domains, which are 4 repeats of a 6TM motif thought to be derived from ancestral potassium channels (Yu et al., 2005) (Figure 1). 6TM voltage-gated potassium channels, in turn, exist as tetramers, so that the assembled channel also has a 24TM topology. Non-voltage-gated TRP channels and cyclic-nucleotide gated (CNG) channels, each of which also displays the 6TM topology, are related to these channels as well, as illustrated in Figure 2 (Yu et al., 2005). These channels are described below in Sections 2.1 (potassium channels), 2.2 (calcium channels), 2.3 (TRP channels) and 2.4 (CNG channels).

  2. The cysteine-loop family of ligand-gated ion channels. These are pentameric channels with each subunit displaying a 4TM topology (Figure 1). These channels, as well as auxiliary subunits for the channels, are described in Section 2.5.

  3. Ionotropic glutamate receptors. Unlike the LGIC-type glutamate-gated anion channels, these are tetrameric cation channels, with each subunit containing four hydrophobic segments, three transmembrane domains, and the P loop that is involved in forming the pore (Figure 1). These channels are described in Section 2.6.

  4. P2X and ASIC channels. These channels are not obviously related by primary sequence, but show structural similarities. They each contain two transmembrane domains, assemble as trimers and form similar pores (Young, 2010). These channels are described in Section 2.7.
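The subunit arithmetic behind these four topologies can be laid out in a short sketch. The per-subunit TM counts and stoichiometries come from the list above; treating the 24TM α subunit as a single pore-forming unit, and leaving the P loop of ionotropic glutamate receptors out of the TM count, are simplifying assumptions, and the `Topology` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    family: str
    tm_per_subunit: int  # transmembrane domains in one pore-forming subunit
    subunits: int        # pore-forming subunits per assembled channel

    @property
    def total_tm(self) -> int:
        """Transmembrane domains in the fully assembled channel."""
        return self.tm_per_subunit * self.subunits


TOPOLOGIES = [
    Topology("Voltage-gated Ca2+/Na+ channel (alpha subunit)", 24, 1),
    Topology("Voltage-gated K+ channel", 6, 4),
    Topology("Cys-loop ligand-gated ion channel", 4, 5),
    Topology("Ionotropic glutamate receptor", 3, 4),
    Topology("P2X / ASIC channel", 2, 3),
]

for t in TOPOLOGIES:
    print(f"{t.family}: {t.subunits} x {t.tm_per_subunit}TM = {t.total_tm}TM")
```

The first two rows make the relationship noted in item 1 explicit: one 24TM α subunit and a tetramer of 6TM potassium channel subunits arrive at the same 24TM assembly.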

The C. elegans genome codes for representatives of all the main families described above, as detailed in the ensuing sections. Within specific families, individual members have been lost in the C. elegans genome, with the most notable absentees being voltage-gated sodium channels, P2X channels and HCN channels, as also discussed below.


Figure 2: The superfamily of voltage-gated ion channels. This phylogenetic tree shows the diversity of the ‘voltage-gated ion channel’ superfamily in metazoan genomes. TRP channels and cyclic nucleotide gated channels are gated by internal ligands or sensory inputs rather than voltage. The ryanodine and IP3 receptors are not shown. Voltage-gated sodium channels and HCN channels are not found in the worm genome. The tree was generated from the minimal pore regions of 143 vertebrate and invertebrate members of the voltage-gated ion channel superfamily. See Figure 1 for overall topology of the voltage-gated superfamily members. For a list of the worm potassium channels, see Table 2. Kv10-12 are the 6TM Eag-like subfamily, KCa is the 6TM Slo and SK family, Kir is the 2TM family and K2P is the two-pore 4TM (TWIK) family. For the voltage-gated calcium channel family, see Table 4; for the TRP family, see Table 7 (TPC is a TRP subfamily with more transmembrane domains and 2 pores; the C. elegans homolog is lov-1); for the CNG family, see Table 8. This figure is reproduced and slightly modified with permission from Yu et al. (2005).

2.1. Potassium channels

2.1.1. The three types of potassium channels

Potassium channels modulate the resting potential of a neuron and are therefore critical determinants of neuronal excitability and synaptic function. A total of 72 potassium channels are encoded in the C. elegans genome. These channels fall into three large structural classes, the 6-transmembrane (6TM), 4TM and 2TM classes (Table 2, see Figure 1 in Potassium channels in C. elegans). All three families are thought to derive from an ancestor with a core 2TM topology (Kir/Kcs class) (Yu et al., 2005). The 4TM channels (TWIK channels, see below) are thought to represent simple duplications of the 2TM topology. The 6TM channels again contain the 2TM core unit but acquired 4 additional, unrelated TM domains (note that the 6TM channel topology constitutes the basic building block of the 24TM calcium/sodium channel class, Figure 1). Even though derived from a common ancestor, potassium channels do not form a homogenous group. Voltage-gated potassium channels of the Eag family (Kv10-12) are more closely related to cyclic-nucleotide-gated channels than they are to other potassium channels (Figure 2).

The most notable feature of C. elegans potassium channels is the large expansion of the two-pore TWIK and TWIK-related channel family (TWIK stands for Tandem of Pore Domains in a Weak Inward Rectifying K+ channel), of which there are 47 members in C. elegans, most of them functionally uncharacterized (Table 2). The human genome contains only around 15 TWIK channels. The expression pattern of 20 of the TWIK channels has been examined by reporter gene fusions. Most of them are expressed in the nervous system (Potassium channels in C. elegans) (Table 2).

2.1.2. Auxiliary subunits of potassium channels

Voltage-gated potassium channels often associate with auxiliary subunits (Table 3). One class of such subunits is the single-pass KCNE/MinK family (four genes in mammals). There are four characterized C. elegans KCNE orthologs (mps-1, mps-2, mps-3, mps-4) that are each expressed in individual neuron types (Park et al., 2005). MPS-1, MPS-2 and MPS-3 interact with the voltage-gated potassium channel KVS-1 (Park et al., 2005). MPS-4 associates with the potassium channel EXP-2, and accelerates activation and deactivation in response to changes in voltage (Park and Sesti, 2007). In addition, the genome contains four uncharacterized genes with homology to mps-3 and four genes with homology to mps-2; all likely arose by local duplications (Table 3). Whether any of these proteins are also auxiliary subunits to potassium channels is unclear given the low degree of sequence homology.

There are four uncharacterized C. elegans genes related to the KChIP/KCNIP family of auxiliary subunits of voltage-gated channels (Pongs and Schwarz, 2010) (Table 3). The KChIP proteins, small EF hand proteins of the NCS superfamily, are unusual as they serve not only as auxiliary subunits, but also as transcriptional regulatory proteins (Burgoyne and Haynes, 2012). Curiously, proteins highly similar to type IV dipeptidyl peptidases, which are normally involved in neuropeptide processing, have also been shown to be auxiliary subunits of voltage-gated potassium channels (Pongs and Schwarz, 2010). Seven genes in the worm genome (dpf-1 through dpf-7) encode proteins related to type IV dipeptidyl peptidases, of which two (dpf-1 and dpf-2) are by far the most similar to type IV dipeptidyl peptidases.

There is an uncharacterized C. elegans homolog (sssh-1) of the fly gene sleepless, which codes for a small GPI-anchored Ly-6/neurotoxin superfamily member that regulates the levels, localization and activity of Drosophila Shaker (Pongs and Schwarz, 2010). Even though there is no obvious worm ortholog of the Kvβ/KCNAB auxiliary subunit family (Pongs and Schwarz, 2010), this family belongs to an extended superfamily of aldo/keto-reductases. mec-14, which is thought to encode an auxiliary subunit of the MEC-4/MEC-10 degenerin channel, is a member of this superfamily too (M. Chalfie, pers. comm.).

Clear orthologs of the auxiliary subunit family Bkβ/KCNMB of calcium-activated potassium channels cannot readily be found in the C. elegans genome. The C. elegans BK channel slo-1 appears instead to use a small protein with a single transmembrane domain (bkip-1 for “BK channel interacting protein”) as an auxiliary subunit (Chen et al., 2010). bkip-1 has no paralogs in C. elegans and no obvious orthologs outside nematodes.

Sulfonylurea receptors (SURs) are auxiliary subunits of the inwardly rectifying Kir family of potassium channels in vertebrates and are members of subfamily C of the ABC transporter family (official names: ABCC8 and ABCC9). There are nine members of the ABCC subfamily in worms (Zhao et al., 2007) (Table 3), yet unlike Drosophila, the worm genome does not contain an obvious ortholog of the ABCC8/9 subfamily. Other ABCC subfamily members may have adopted the auxiliary potassium channel subunit function, with perhaps ctf-1 being the best candidate (Table 3).

TWIK channels might also rely on auxiliary subunits. A multipass transmembrane protein, UNC-93, co-localizes with the TWIK channel SUP-9 and is required for its function (de la Cruz et al., 2003). UNC-93 is phylogenetically conserved and is part of a larger family of 17 related C. elegans proteins that are presently uncharacterized (de la Cruz et al., 2003) (Table 3). This family has expanded in C. elegans, mirroring the expansion of TWIK channels. Another transmembrane protein required for SUP-9/TWIK function, called SUP-10, may also be an auxiliary subunit (de la Cruz et al., 2003), but it is not phylogenetically conserved and has no worm paralogs.

2.2. Calcium channels, transporters and calcium binding proteins

Calcium is a broadly used signaling molecule, but it also has several specialized functions in the nervous system, e.g., in synaptic vesicle release, in modulation of ion channel activity and, of course, as an ion that is itself involved in generating currents across excitable membranes in neurons. This is an absolutely critical feature of calcium since C. elegans does not generate sodium-based action potentials (Goodman et al., 1998). In this section, I will not only summarize calcium channels but also cover other genes related to “neuronal calcium”.

The molecular biology of neuronal calcium is briefly summarized in Figure 3 (Grienberger and Konnerth, 2012). Some calcium-permeant channels, namely nAChR-type receptors and glutamate receptors (NMDA, Kainate and AMPA-type), are discussed in ensuing sections (Sections 2.5 and 2.6), and so are metabotropic receptors that signal to mobilize intracellular calcium stores (Section 5.1).


Figure 3: Neuronal Calcium Signaling. Proteins that control calcium influx and efflux. Numbers in red circles reflect the number of C. elegans genes in each category, although some of the homologs may not be expressed in the nervous system. Sources of calcium influx are nicotinic acetylcholine receptors (nAChR; 61 C. elegans genes; Table 9), AMPA and NMDA-type glutamate receptors (at least 10 C. elegans genes; Table 11), transient receptor potential type C channels (TRPC; 3 C. elegans genes; Table 7) and voltage-gated calcium channels (VGCC; 9 C. elegans genes; Table 4). Calcium release from internal stores is mediated by inositol trisphosphate receptors (IP3R; 1 C. elegans gene) and ryanodine receptors (RyR; 1 C. elegans gene). Inositol trisphosphate can be generated by metabotropic glutamate receptors (mGluR; 5 C. elegans genes; Table 19) as well as by other Gq coupled GPCRs. Calcium efflux is mediated by the plasma membrane calcium ATPase (PMCA; 3 C. elegans genes), the sodium-calcium exchanger (NCX; 10 C. elegans genes; Table 5), and the sarco-endoplasmic reticulum calcium ATPase (SERCA; 1 C. elegans gene). Intracellular calcium is sensed and buffered by calcium binding proteins, of which there are many dozens in the worm genome (Table 6). Mitochondria also play important roles in neuronal calcium homeostasis; the C. elegans calcium uniporter is encoded by mcu-1. This figure is a modified version of a figure taken from Grienberger and Konnerth (2012).

2.2.1. Voltage-gated calcium channels and auxiliary subunits

Voltage-gated calcium channels (VGCCs) are composed of a pore-forming unit, the 24TM domain-containing α1 subunit, and are usually associated with auxiliary β subunits and α2δ subunits. There are five α1 subunits in the worm genome, two α2δ subunits and two β subunits (Table 4). The role of the γ subunit, a family of tetraspanin molecules, remains unclear. These molecules (two of them exist in the C. elegans genome, stg-1 and stg-2) are now thought to have a major role in AMPA glutamate receptor biology, as mentioned in Section 2.6.

α1 subunits come in three families, Cav1, Cav2 and Cav3. These correspond to the physiologically defined L-type (‘long-lasting’), N-type (‘Non-L’ or ‘neuronal’, includes the P, Q and R types) and T-type (‘transient’) channels (Catterall et al., 2005). Mammals possess several subtypes of each channel type differing in tissue and subcellular distribution. Only a single gene for each type is found in invertebrates such as C. elegans (Table 4). Specifically, egl-19 codes for the L-type, unc-2 for a Non-L-type and cca-1 for a T-type channel.

In addition, C. elegans contains two members of the α1U branch of invertebrate and vertebrate cation channels (nca-1 and nca-2), which are more distantly related to the α1 type. The channels require two phylogenetically conserved auxiliary proteins for their correct localization, encoded by unc-79 and unc-80 (Humphrey et al., 2007; Jospin et al., 2007). There are no obvious paralogs of unc-79 or unc-80 in the worm genome.

2.2.2. Other calcium channels

Two types of channel proteins are involved in mobilizing calcium from intracellular stores (Figure 3). Ryanodine receptors (RyRs) represent a class of intracellular calcium channels with prominent roles in excitable cells like muscles and neurons. In vertebrates, there are three RyRs: two for different types of muscle, and one expressed more broadly but most predominantly in the brain. There is a single RyR in C. elegans, encoded by the unc-68 locus. Expression analysis originally localized the protein to muscle, but more recent studies show that the gene also functions in neurons (Liu et al., 2005). There is also a single IP3 receptor, another intracellular calcium channel, encoded by the itr-1 gene. It is expressed in the intestine but also in some neurons and muscle (Table 5).

ORAI/CRAC ion channels are unusual 4TM, tetrameric plasma membrane channels that are activated by depletion of intracellular calcium stores. This activation works through an ER-resident calcium sensor STIM1 (an EF hand protein) that is directly linked to the plasma membrane channel (Figure 3). The C. elegans genome codes for one ORAI ortholog, orai-1 and one STIM1 ortholog, stim-1. They operate in reproductive tissue (Strange et al., 2007) and their function in the nervous system has not yet been explored.

2.2.3. Calcium transporters

Cytosolic calcium concentrations are controlled by sodium-coupled transporters of the SLC8 and SLC24 families (Figure 3). In vertebrates, many of these proteins are expressed strongly in the brain and have various brain-specific functions (Lytton, 2007). There are three members of the SLC8 family in worms (ncx-1 through ncx-3 for “Na+/Ca++ exchangers”) and seven members of the SLC24 family (ncx-4 through ncx-10) (Table 5). None of these transporters have yet been investigated for expression or function. There are also ATPases that transport calcium across the plasma membrane. There are three such ATPases in worms: mca-1 (expressed in excretory cell), mca-2 (hypodermis) and mca-3 (many tissues including neurons). A single homolog of the SERCA-type sarco-endoplasmic reticulum Ca++ ATPase, sca-1, exists in worms (Figure 3).
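The gene counts in this paragraph can be tallied against Figure 3 and Table 1 with a small sketch. The gene names are those given in the text; the dictionary layout and the helper `count_by_class` are illustrative assumptions.

```python
# Gene names as listed in the text; grouping labels are illustrative.
CALCIUM_TRANSPORTERS = {
    "SLC8 family": ["ncx-1", "ncx-2", "ncx-3"],
    "SLC24 family": [f"ncx-{i}" for i in range(4, 11)],  # ncx-4 .. ncx-10
    "Plasma membrane Ca2+ ATPases": ["mca-1", "mca-2", "mca-3"],
    "SERCA-type Ca2+ ATPase": ["sca-1"],
}


def count_by_class(groups):
    """Map each transporter class to its number of genes."""
    return {name: len(genes) for name, genes in groups.items()}


counts = count_by_class(CALCIUM_TRANSPORTERS)
ncx_total = counts["SLC8 family"] + counts["SLC24 family"]
grand_total = sum(counts.values())

print(counts)
print("NCX genes:", ncx_total)                    # 10
print("All calcium transporters:", grand_total)   # 14
```

The 10 exchanger genes match the NCX count in the Figure 3 legend, and the overall 14 matches the calcium transporter row of Table 1.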

2.2.4. Calcium binding proteins

Intracellular calcium binds to proteins via a number of different motifs, the most prominent being the small EF hand motif (other calcium binding motifs, such as the C2 domain also have other binding partners). A number of vertebrate EF hand proteins, calbindin, calretinin and parvalbumin, have served as “classic” markers for specific neuron types in the vertebrate nervous system. One family of EF hand proteins, the NCS (“neuronal calcium sensor”) family (14 genes in mammals) has many specialized functions in the nervous system, often relating to ion channel regulation (Burgoyne and Haynes, 2012). Generally, EF hand proteins are thought to act as either “sensor” proteins that respond to calcium with a conformational change that triggers downstream events or as “buffer” proteins that control local calcium concentration; that distinction is, however, beginning to blur (Schwaller, 2009).

There are more than 100 genes in the worm that code for easily recognizable EF hand containing proteins (by contrast, humans are thought to have several hundred), many of them with very broad cellular functions. Given plenty of precedents, the most likely candidates among these genes for neuron-specific functions are those that code exclusively for EF hands and no other domains. C. elegans contains 64 such genes (Table 6). There is one ortholog of classic calmodulin (cmd-1), eight calmodulin-related genes (there are many calmodulin-related genes in humans, too) and seven members of the NCS family of calcium sensor proteins (14 in humans), including homologs of human NCS-1 and the KChIP/DREAM proteins (mentioned above in the context of their role as K+ channel auxiliary proteins). There are 48 additional genes that code for proteins that exclusively contain EF hands and no other domains (Table 6). Many of them are C. elegans orthologs of well-characterized mammalian proteins with well-documented roles in the nervous system, but among them are also 16 genes with no obvious vertebrate homologs. Based on sequence, there are no obvious nematode orthologs of calbindin, parvalbumin or calretinin.

2.3. TRP channels

The TRP (Transient Receptor Potential) superfamily of cation channels is evolutionarily related to voltage-gated ion channels: its members contain six transmembrane domains and a pore loop between the fifth and sixth transmembrane domains. However, TRP channels are generally not activated by voltage, but rather by a remarkable diversity of ligands or sensory inputs (Kahn-Kirby and Bargmann, 2006; Venkatachalam and Montell, 2007).

TRP channels fall into distinct classes based on overall sequence features (Yu et al., 2005) (Figure 2). The C. elegans genome contains 23 genes that display similarities to TRP channels. 17 of them are canonical TRP channels which fall into the TRPA, TRPC, TRPM, TRPML, TRPN, TRPV and TRPP subfamilies (Table 7) (Kahn-Kirby and Bargmann, 2006; Xiao and Xu, 2009) and one is a TRP-related TRPP1-type protein, LOV-1 (an 11-transmembrane domain protein). Five genes code for uncharacterized, multipass-transmembrane paralogs of a nematode-specific expansion (named trpl for TRP-like) (Table 7). trpl genes show relatively little sequence similarity to TRP channels, but do contain sequence signature motifs found in TRPM channels (Panther domain PTHR13800 “TRP, SUBFAMILY M”). The human genome encodes 28 TRP channel genes. Many of the C. elegans genes have been functionally analyzed and most are expressed in the nervous system. The so-far-characterized neuronally expressed TRP channels function as thermosensors, mechanoreceptors, proprioceptors or transduce signals in olfaction (Xiao and Xu, 2009).

2.4. Cyclic nucleotide-gated ion channels

Cyclic nucleotide-gated (CNG) ion channels are signal-transducing cation-selective ion channels that form tetramers using specific combinations of α and β-type subunits. Even though they are not voltage-gated, they are members of the superfamily of voltage-gated ion channels (Yu et al., 2005) (Figure 2). C. elegans contains a total of six CNGs (Table 8). One, tax-4, encodes a canonical α subunit, whereas tax-2 encodes a canonical β subunit, and both have been implicated in various sensory paradigms (see Chemosensation in C. elegans). Vertebrates also contain six CNG channels, and all are α- or β-type. Four additional C. elegans CNGs, cng-1, cng-2, cng-3 and che-6, encode neither clear α nor clear β subunits but display a somewhat higher sequence affinity to α subunits. The expression of most of the six CNGs has been investigated, revealing expression in partially overlapping subsets of sensory neurons.

As mentioned above, hyperpolarization-activated channels (HCNs) are related in sequence to the CNGs but, in contrast to flies and vertebrates, the C. elegans genome contains no HCN orthologs (Figure 2).

2.5. Ligand-gated ion channels

Neurotransmitters signal via two types of receptors: ion channels, also called ionotropic receptors (this section), and G-protein-coupled receptors (GPCRs), also called metabotropic receptors (Section 5.1 below).

Most ligand-gated ion channels (LGICs) in the C. elegans genome fall into the cysteine-loop superfamily of ion channels, which are characterized by the presence of a disulfide bond between two invariant cysteine residues in an extracellular loop region (Figure 1). Cys-loop LGICs consist of five homologous subunits arranged in a homomeric or heteromeric manner around a central pore (Sine and Engel, 2006). In mammals, the LGIC superfamily consists of about 45 genes, insects have just over 20 such genes, but the C. elegans genome contains 102 LGIC subunit-encoding genes (Jones and Sattelle, 2008) (Figure 4). Members of this C. elegans gene family include cation-permeable acetylcholine receptors related to vertebrate nicotinic acetylcholine receptors (nAChRs, see Section 2.5.1), anion-permeable GABA receptors related to vertebrate GABAA receptors (Section 2.5.2), and glutamate-gated anion channels (Section 2.5.2) related to channels found widely in invertebrate species (Jones and Sattelle, 2008). In addition, C. elegans contains LGICs not yet identified in vertebrates and insects, including anion channels gated by acetylcholine or biogenic amines (serotonin, tyramine, dopamine) (Ringstad et al., 2009), and possibly other ligands. Of the many additional orphan LGICs (all termed lgc genes) several fall into broad families, but it is unknown how they are gated (Jones and Sattelle, 2008).

A phylogenetic analysis of the LGIC superfamily from various nematode and non-nematode species (Rufener et al., 2010) reveals that the above-mentioned groups fall into two large blocks, as illustrated in Figure 4: a very large group of the nAChR-related genes (including vertebrate and C. elegans bona-fide nAChRs, as well as many “orphan” genes, Section 2.5.1) and a block of non-nAChR-type genes (Section 2.5.2). Characterized members of the former block are cation channels and characterized members of the latter block (with the exception of exp-1) are anion channels.


Figure 4: Phylogenetic analysis of cysteine-loop ligand-gated ion channels. This phylogenetic tree was generated from the ligand-binding domains of 1426 putative LGIC genes. Genes were identified by a BLAST search using 210 seed sequences and then refined using Genewise. A thousand bootstrap iterations were performed and branches below 50% bootstrap support were collapsed. Nematode sequences are shown in shades of green, platyhelminthes in yellow, insects in purple and vertebrates in red. C. elegans subunit names are labeled in green. This figure is adapted with permission from Rufener et al. 2010 with changes in the coloring scheme.

2.5.1. nAChR-type ligand-gated ion channels of the Cys-loop LGIC superfamily

In C. elegans, the group of nAChR-type ligand-gated ion channels of the Cys-loop LGIC superfamily consists of 61 diverse genes (almost 3 times as many as in mammals), some of them well-characterized (Table 9). These genes can be divided into subgroups (Jones et al., 2007). As Figure 4 illustrates, the most striking subgroups are the UNC-29 subgroup (four genes: unc-29, lev-1, acr-2, acr-3), the UNC-38 subgroup (six genes: unc-38, unc-63, acr-6, acr-8, acr-12, acr-13) and the DEG-3 subgroup (eight genes: deg-3, des-2, acr-5, acr-17, acr-18, acr-20, acr-23, acr-24). Each of these subgroups contains functionally characterized nAChR channel subunits. Notably, though, a heteromeric channel composed of DEG-3 and DES-2 proteins (encoded by an operon) appears to be a sensory receptor that responds to ambient choline (Yassin et al., 2001).

Within the large number of uncharacterized lgc genes in this group, additional subgroups can be observed (Jones et al., 2007), including some obvious recent duplications that created very close paralogous gene pairs (e.g., lgc-7 and lgc-8, lgc-16 and lgc-17). Two members of this diverse group (pbo-5 and pbo-6) function as proton-gated ion channels (Beg et al., 2008), illustrating the wide range of gating mechanisms possible for orphan members of this group. The expression pattern is not known for most of the orphan lgc genes (Table 9).

2.5.2. Other ligand-gated ion channels of the Cys-loop LGIC superfamily (GABA, glutamate and others)

The group of LGICs that is phylogenetically distinct from the nAChR group contains 41 genes (Sine and Engel, 2006; Rufener et al., 2010) (Figure 4, Table 10). This group has extensively radiated and diversified in worms compared to humans, where it consists of 19 relatively closely related GABAA receptor-encoding genes and 5 glycine receptor-encoding genes (Tsang et al., 2007). Glycine receptor genes are thought to be a vertebrate-specific invention. The 41 C. elegans genes can be broadly subdivided into several subgroups based on sequence similarity. With one exception (exp-1) all characterized members of this group are anion channels:

  1. GABA-gated ion channel subgroup. This subgroup, consisting of seven genes, codes for canonical GABA-gated chloride channels. Some of these genes are closely related to vertebrate GABAA receptors. The members are gab-1, unc-49, lgc-36, lgc-37, lgc-38 and the more distant paralogs exp-1 and lgc-35. unc-49, gab-1 and exp-1 encode bona fide GABA-gated channels (see the WormBook chapter GABA). exp-1 is the odd man out in this overall LGIC group since it is the only cation channel.

  2. Inhibitory, ACh-gated chloride channel subgroup. This subgroup contains eight genes, including the four electrophysiologically characterized acc-1 through acc-4 genes (Putrenko et al., 2005), as well as the currently uncharacterized lgc-46, lgc-47, lgc-48, and lgc-49 genes.

  3. Biogenic amine-gated subgroup. This subgroup contains eight genes including mod-1, which encodes an electrophysiologically characterized serotonin-gated chloride channel, lgc-55 which encodes a tyramine-gated chloride channel and lgc-53, encoding a dopamine-gated chloride channel (Pirri et al., 2009; Ringstad et al., 2009). The ligands for the remaining channel-encoding genes—lgc-50, lgc-51, lgc-52, lgc-54 and ggr-3 (whose name, GABA/Gly receptor, is a bit of a misnomer as it displays no specific affinity to ggr-1 and ggr-2, which are in a different and possibly GABA-gated subgroup)—are not yet identified. Vertebrates also have serotonin-gated channels, but those are cation-selective, and not anion-selective like MOD-1.

  4. Glutamate-gated anion channels subgroup. This invertebrate-specific subgroup contains six genes—glc-1 through glc-4, avr-14, avr-15 (see Ionotropic glutamate receptors: genetics, behavior and electrophysiology). Receptors encoded by these genes are ivermectin-sensitive. They have been speculated to be the invertebrate homologs of glycine receptors (Vassilatis et al., 1997). From the ligand perspective, note that this family is one of two types of glutamate-gated ion channels in the C. elegans genome. The other type is unrelated to the pentameric LGICs and contains glutamate-gated cation channels related to vertebrate AMPA/kainate/NMDA receptors. These are discussed in Section 2.6.

The remaining 12 members of this group comprise nine genes that are related to each other (ggr-1, ggr-2, lgc-39 through lgc-45) and 3 genes that show no affinity to any subgroup (lgc-32, lgc-33, lgc-34). The LGC-40 channel has been shown to be a low-affinity serotonin receptor that is also gated by choline and acetylcholine (Ringstad et al., 2009).

All genes are listed in Table 10. The expression of most family members is not known.
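The subgroup sizes quoted in this section can be cross-checked against the stated totals. The following is a minimal sketch that simply transcribes the gene names given in the text above; the helper `gene_range` and the dictionary labels are illustrative names introduced here, not nomenclature from Table 10.

```python
# Cross-check of the subgroup sizes quoted in Section 2.5.2.
# Gene names are transcribed from the text; the helper and the
# dictionary labels are illustrative, not official nomenclature.

def gene_range(prefix, first, last):
    """Expand e.g. ('lgc', 39, 45) into ['lgc-39', ..., 'lgc-45']."""
    return [f"{prefix}-{i}" for i in range(first, last + 1)]

SUBGROUPS = {
    "GABA-gated": ["gab-1", "unc-49", "exp-1", "lgc-35"] + gene_range("lgc", 36, 38),
    "ACh-gated chloride": gene_range("acc", 1, 4) + gene_range("lgc", 46, 49),
    "biogenic amine-gated": ["mod-1", "ggr-3"] + gene_range("lgc", 50, 55),
    "glutamate-gated anion": gene_range("glc", 1, 4) + ["avr-14", "avr-15"],
    "ggr-1/ggr-2/lgc-39 to lgc-45": ["ggr-1", "ggr-2"] + gene_range("lgc", 39, 45),
    "no subgroup affinity": gene_range("lgc", 32, 34),
}

NACHR_TYPE_TOTAL = 61  # size of the nAChR-type group stated in Section 2.5.1

all_genes = [gene for genes in SUBGROUPS.values() for gene in genes]
non_nachr_total = len(all_genes)

for name, genes in SUBGROUPS.items():
    print(f"{name:30s} {len(genes):2d}")
print(f"{'non-nAChR-type total':30s} {non_nachr_total:2d}")
print(f"{'all Cys-loop LGIC subunits':30s} {NACHR_TYPE_TOTAL + non_nachr_total:3d}")
```

Listing the names rather than only the counts also makes it easy to confirm that no gene is assigned to two subgroups.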

2.5.3. Auxiliary subunits of LGICs

LGICs require auxiliary subunits for their trafficking, assembly and function. The best characterized auxiliary subunits are those for the nAChRs and many of them were first identified through functional analysis in C. elegans (Table 3). These include the unrelated genes ric-3, unc-50, and unc-74 (Boulin et al., 2008), as well as nra-2 and nra-4, which encode ER-resident type I transmembrane proteins (Almedom et al., 2009). With the exception of nra-2, which is related to the nicastrin-encoding aph-2 gene, none of these genes have additional paralogs in the C. elegans genome. LEV-9, a protein with multiple Sushi/CCP domains, is an additional auxiliary subunit identified by functional analysis (Gendrel et al., 2009). The gene adjacent to lev-9, T07H6.4, encodes a protein with the same domain composition as LEV-9 but its function is unknown. Another nAChR auxiliary subunit protein, LEV-10, contains several CUB domains, an LDL domain, and a transmembrane domain. There are three more genes in the genome coding for proteins with a similar domain architecture: mig-13, neto-1 (the ortholog of vertebrate Neto1/2), and K05C4.11. To date, mig-13 has only been implicated in cell migration, not AChR function, while neto-1 and K05C4.11 are uncharacterized. An alternatively spliced form of the lev-10 locus, called eat-18, is required for cholinergic transmission in the pharynx (McKay et al., 2004).

A one-pass transmembrane protein, MOLO-1, that contains a single extracellular globular domain, the TPM domain, was recently found to be a new auxiliary subunit of nAChR (Boulin et al., 2012). The worm genome contains six molo-1 paralogs (Table 3). Vertebrate GPI-anchored or transmembrane Lynx/SLURP proteins have also been implicated in nAChR function (Jones et al., 2010). These proteins contain a characteristic LU (“Ly-6 antigen / uPA receptor”) domain. There are four C. elegans genes (lurp-1 through lurp-4) encoding proteins with a similar domain architecture, all of them uncharacterized to date (Table 3). In addition, the C. elegans genome contains 10 proteins with homology to the Ly6 domain (InterPro domain IPR010558). They all appear to originate from a nematode-specific expansion. The founding member of this family, ODR-2, was identified by its involvement in odortaxis (Chou et al., 2001). The nine paralogs of ODR-2 are called hot genes (for “homologs of odr two”) (Table 3). Although their mechanism of action is not known, the homology of all these proteins with the Lynx/SLURP-type regulators of LGICs as well as the documented neuronal function of ODR-2, suggest that these worm proteins could function as regulators of LGICs.

Even though not considered an auxiliary subunit per se, the rapsyn protein is required for clustering of nAChRs on vertebrate skeletal muscle. The C. elegans genome contains one functionally conserved rapsyn ortholog, rpy-1, which is expressed in both muscle and neurons (Nam et al., 2009). Another nAChR clustering protein is the secreted OIG-4 protein, which is composed of a single immunoglobulin (Ig) domain (Rapti et al., 2011). C. elegans has five secreted 1-Ig domain proteins (oig-1 through oig-5).

2.6. Ionotropic glutamate receptors

2.6.1. Two types of glutamate-gated ion channels

As mentioned briefly above, two types of glutamate-gated ion channels are encoded in the C. elegans genome. One group consists of inhibitory glutamate-gated anion channels, which are members of the Cys-loop LGIC family and which have been described above (Table 10). The second group is composed of the highly conserved glutamate-gated cation channels (“ionotropic glutamate receptors” or iGluRs). These glutamate receptors are tetrameric and related to the AMPA, kainate and NMDA receptors in vertebrates. There are ten subunits encoded in the C. elegans genome. Two of them form NMDA receptor-type channels (encoded by nmr-1 and nmr-2) and eight form AMPA receptor-type channels (encoded by glr-1 through glr-8) (Table 11). All of these genes are expressed in distinct and partly overlapping sets of neurons (see Ionotropic glutamate receptors: genetics, behavior and electrophysiology).

In addition, there are five related and as yet unnamed genes in the genome whose protein products share homology with the AMPA-type glr genes (e-value in BLAST search 1e-04 to 5e-09) (Table 11). They all contain predicted ligand-binding domains related to solute-binding domains in bacterial amino acid-binding proteins. They have several transmembrane segments, but tend to code for smaller proteins than the NMR/GLR proteins. These genes, as well as the more canonical glr genes glr-7 and glr-8, may belong to a newly defined subtype of iGluRs, termed ionotropic receptors (IRs), that serve as chemosensory molecules in flies (Croset et al., 2010). Three C. elegans genes (glr-7, glr-8, W02A2.5) fulfill sequence criteria to be IR genes, and two of these are expressed in pharyngeal neurons, suggesting roles in food sensing (Croset et al., 2010). It is interesting to recall here the above-mentioned LGIC proteins DEG-3 and DES-2, which serve as sensory channels for ambient choline. Perhaps it is a general feature of different types of ion channel families to be employed as sensory receptors for ambient metabolites.

2.6.2. Auxiliary subunits for AMPA-type glutamate receptors

Ionotropic glutamate receptors require a number of distinct auxiliary transmembrane proteins collectively called TARPs (for transmembrane AMPA receptor regulatory proteins) (Jackson and Nicoll, 2011). The C. elegans genome contains the TARP sol-1, which codes for a CUB domain protein, and stg-1 and stg-2, which code for proteins related to the vertebrate TARP stargazin. The vertebrate CUB/LDL/TM proteins Neto1 and Neto2 also function as TARPs and, as mentioned above in the context of nAChR auxiliary subunits, there are a total of four Neto1/2-like proteins encoded in the C. elegans genome (besides a Neto1/2 ortholog, neto-1, there are lev-10, mig-13 and K05C4.11). C. elegans also contains an uncharacterized homolog of the vertebrate TARP Cornichon (Jackson and Nicoll, 2011), cni-1, but lacks obvious homologs of the SynDIG1 or CKAMP44 TARPs (Table 3).

2.7. DEG/ENaC/ASIC channels

2.7.1. Channels

DEGenerin/Epithelial Na+ Channels/Acid sensing ion channels (DEG/ENaC/ASIC) constitute, together with the related P2X channels, the fourth type of ion channel superfamily (Figure 1). P2X-type ion channels, which are directly activated by adenosine triphosphate (ATP) (Fountain and Burnstock, 2009), can be found in all vertebrate species, in marine invertebrate species like mollusks and sea urchins, and even in fungi, but they appear to have been lost in C. elegans and Drosophila (Bavan et al., 2009).

The related DEG/ENaC/ASIC channels have been implicated in a broad spectrum of cellular functions and can be gated by a variety of distinct mechanisms, ranging from mechanosensory stimuli to pH to small ligands, such as FMRFamide peptides (see Mechanosensation; Bazopoulou and Tavernarakis, 2007). Individual proteins cross the membrane twice, have intracellular N- and C-termini and a large extracellular loop that includes a conserved cysteine-rich region. Their multimeric state was initially controversial, but recent work suggests that they are trimers (Jasti et al., 2007; Gonzales et al., 2009). The naming of this class in the literature is not always consistent: DEG, ENaC and ASIC channels are specific subtypes of these receptors and sometimes the entire family is either referred to only as ASIC or as DEG/ENaC. I refer to them here with all three names.

With a total of 30 members (Table 12), C. elegans has expanded its repertoire of DEG/ENaC/ASIC channels significantly compared to ~10 genes in mammals. There are no specific ortholog pairs of vertebrate and worm channels, suggesting independent radiation of this gene family (Bazopoulou and Tavernarakis, 2007). Even though some of their names (Table 12) may suggest otherwise, none of the C. elegans members are more closely related to vertebrate ASIC or ENaC proteins. Nevertheless, domain analysis as well as the clustering in phylogenetic trees (TreeFam TF317359) suggest that some but not all of the 30 genes fall into related subgroups (Bazopoulou and Tavernarakis, 2007) (Table 12). One of these subgroups, the “egas” subgroup, contains a peculiar domain combination of the signature ASC domain (present in all superfamily members) and multiple EGF repeats. The only other clade in which such a combination is found is the hemichordates. No expression patterns have yet been reported for this subfamily.

Almost half of the DEG/ENaC/ASIC channels have been characterized for expression or function. With two exceptions (flr-1 and unc-105) they are all expressed in the nervous system or have specific neuronal functions (Table 12).

2.7.2. Auxiliary subunits of DEG/ENaC/ASIC channels

Stomatins are membrane proteins thought to be auxiliary subunits that modulate the activity of DEG/ENaC/ASIC channels both in worms and vertebrates (Lapatsina et al., 2012). They are defined by the presence of a characteristic and structurally conserved core domain called the stomatin or SPFH (Stomatin, Prohibitin, Flotillin, HflK/HflC) domain. There are five mammalian stomatin genes and ten C. elegans stomatin-like genes (mec-2, unc-1, unc-24, stl-1, sto-1 through sto-6), most of them originating from an apparently nematode-specific expansion (Table 3). Even though a stomatin has only been explicitly demonstrated to be an auxiliary subunit for MEC-4/MEC-10 degenerin channels, the co-localization of UNC-1 protein with an innexin protein (Chen et al., 2007) suggests that stomatins may also be auxiliary subunits for different types of transmembrane channels (see Section 10 for innexins). This is consistent with the physical association of vertebrate stomatin-like proteins with a TRP channel (Lapatsina et al., 2012). The expression patterns of five of the ten stomatins have been analyzed and neuronal expression was detected for each of them. The MEC-4/MEC-10 DEG channel complex employs not only a stomatin as an auxiliary protein, but also an oxidoreductase-related protein, MEC-14.

2.8. Chloride channels and chloride transporters

2.8.1. Chloride channels

Plasma membrane-localized chloride channels are molecularly diverse and have many distinct functions in the nervous system. Besides the neurotransmitter-gated chloride channels mentioned above (Section 2.5.2), there are a number of additional chloride channels, some only recently identified as such (Duran et al., 2010). In a good number of cases, the distinction between chloride channels and transporters is blurry.

One major type of chloride channel is the CLC superfamily, members of which control the membrane potential of cells. Some CLC channels are voltage gated, while others function as chloride/proton exchangers. The C. elegans genome contains six members of the phylogenetically very ancient family of CLC chloride channel proteins (Schriever et al., 1999) (Table 13).

C. elegans contains a calcium-regulated chloride channel of the Tweety family, ttyh-1. The channel conducts large chloride (“maxi-Cl-”) currents. C. elegans ttyh-1 is expressed widely throughout the nervous system, but has not yet been functionally characterized. Vertebrate Tweety was recently found to be associated with synaptic vesicles (Morciano et al., 2009). C. elegans also contains two members of the recently defined anoctamin family of calcium-activated chloride channels, both also presently uncharacterized (anoh-1 and anoh-2) (Table 13).

Bestrophins are another family of plasma membrane-located, calcium-activated chloride channels (four genes in mammals) (Duran et al., 2010). Bestrophins are expressed in multiple vertebrate tissue types including the nervous system. C. elegans has significantly expanded its repertoire of these bestrophin-like genes: there are 26 family members, an expansion by a factor of more than six compared to mammals (Table 13). All proteins share a homology region (“Bestrophin” or “RFP-TM” domain) of 350-400 amino acids. Two of the three C. elegans genes whose expression has been analyzed so far with reporter genes show expression in the nervous system (Table 13).
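To put the expansion factor quoted for the bestrophins next to the other families discussed so far, the gene counts given in the text can be tabulated as worm-to-mammal ratios. This is only a rough sketch: the counts are the ones quoted in this chapter, several of them are approximate ("about 45", "~10"), and the family labels are shorthand chosen here.

```python
# Worm vs. mammalian (or human/vertebrate) gene counts as quoted in the text.
# Several mammalian counts are approximate, so the ratios are rough guides only.
COUNTS = {
    "Cys-loop LGIC subunits": (102, 45),
    "DEG/ENaC/ASIC channels": (30, 10),
    "stomatins": (10, 5),
    "bestrophins": (26, 4),
    "TRP channels": (23, 28),
    "CNG channels": (6, 6),
}

def fold_change(worm, mammal):
    """Ratio of worm to mammalian gene number for one family."""
    return worm / mammal

# Sort families from the strongest expansion in the worm to the weakest.
ranked = sorted(COUNTS.items(), key=lambda item: fold_change(*item[1]), reverse=True)

print(f"{'family':26s} {'worm':>4s} {'mamm':>4s} {'ratio':>6s}")
for family, (worm, mammal) in ranked:
    print(f"{family:26s} {worm:4d} {mammal:4d} {fold_change(worm, mammal):5.1f}x")
```

A ratio below 1, as for the TRP channels, indicates a family that is somewhat smaller in the worm than in humans.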

There are no obvious worm (or fly) homologs of calcium-activated chloride channels of the CLCA family.

2.8.2. Chloride transporters

The very large superfamily of SLC “solute carrier” transporters (>100 genes, mentioned again in Section 3 in the context of neurotransmitter transporters) includes one family, the SLC12 family, which transports chloride across membranes and which has important functions in the nervous system. Their importance (and the reason why they are mentioned here in the context of ion channels) stems from the fact that intracellular chloride concentration determines the strength and polarity of the response to inhibitory neurotransmitters, such as GABA, that act on chloride channels (Hebert et al., 2004) (Figure 5). Specifically, in vertebrates the relative expression levels of the K+/Cl- cotransporter KCC2 (SLC12A4-7 subfamily) and the Na+/K+/2Cl- cotransporter NKCC1/2 (SLC12A1,2 in vertebrates) determine whether neurons respond to GABA (or other transmitters) with a depolarizing, excitatory response or with a hyperpolarizing, inhibitory response (Hebert et al., 2004). There are three homologs of the vertebrate KCC family in worms (kcc-1, 2, and 3) and one member of the sodium potassium chloride cotransporter Nkcc family (nkcc-1) (Tanis et al., 2009) (Table 5), compared to 4 Kcc genes and 2 Nkcc genes in humans. C. elegans kcc-2 is indeed required to determine the inhibitory action of various neurotransmitters and is expressed in the nervous system (Tanis et al., 2009). nkcc-1 expression and function have not yet been reported. C. elegans has two more distant homologs of the NKCC/SLC12A1-3-type Na+/K+/2Cl- cotransporter, F10E7.9 and B0303.11, which are expressed in neurons and the excretory system, respectively (Table 5).
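As a rough illustration of why the intracellular chloride concentration sets the polarity of the response, the chloride reversal potential can be estimated with the Nernst equation. The sketch below is not taken from the studies cited above: the concentrations, the temperature and the resting potential are hypothetical values chosen only to show the direction of the effect.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol
Z_CL = -1         # valence of the chloride ion

def chloride_reversal_mv(cl_out_mm, cl_in_mm, temp_c=20.0):
    """Nernst potential for Cl- in millivolts: (RT / zF) * ln([Cl]out / [Cl]in)."""
    temp_k = temp_c + 273.15
    return 1000.0 * (R * temp_k) / (Z_CL * F) * math.log(cl_out_mm / cl_in_mm)

# Hypothetical values for illustration only (mM and mV).
CL_OUT = 120.0
V_REST = -60.0

scenarios = [
    ("low [Cl-]i (efflux dominates)", 7.0),
    ("high [Cl-]i (influx dominates)", 30.0),
]

for label, cl_in in scenarios:
    e_cl = chloride_reversal_mv(CL_OUT, cl_in)
    # Opening a chloride channel drives the membrane potential toward E_Cl.
    effect = "hyperpolarizing" if e_cl < V_REST else "depolarizing"
    print(f"{label}: E_Cl = {e_cl:.1f} mV, {effect} relative to {V_REST:.0f} mV")
```

Under these assumed values, lowering intracellular chloride moves the reversal potential below the resting potential, so that opening a GABA-gated chloride channel hyperpolarizes the cell, whereas raising intracellular chloride has the opposite effect.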


Figure 5: Chloride fluxes across neuronal membranes. The nkcc, kcc and abts genes code for members of the SLC superfamilies SLC12 and SLC4 (see Table 5). NKCCs mediate Cl- influx by coupling transport to the Na+ gradient, KCCs mediate Cl- efflux by coupling transport to the K+ gradient, and ABTS transporters mediate Cl- efflux and acid extrusion (Bellemer et al., 2011). As shown in Table 5, each of the SLC genes has multiple paralogs in the worm genome. The genes nkcc-1, kcc-2 and abts-1 have been shown to mediate these functions (Tanis et al., 2009; Bellemer et al., 2011). The direction of chloride flow through ligand-gated anionic channels like the GABA receptors (one representative shown here) is dictated principally by the ion concentration gradients produced by the three types of transporters. This figure is adapted from Bellemer et al. (2011).

There are two additional members of the SLC12 chloride transporter family, SLC12A8 and SLC12A9, and C. elegans contains an ortholog of SLC12A9 (T04B8.5), which is expressed in neurons and muscle (Table 5). SLC12A9 is thought to modulate the activity of the related Nkcc transporter (Caron et al., 2000).

Apart from the SLC12 subfamily, the sodium-dependent chloride/bicarbonate transporters of the SLC4 family are known to regulate chloride balance and pH in the nervous system (Bellemer et al., 2011) (Figure 5). There are four SLC4 members in the worm genome (abts-1 through abts-4), and all four are expressed in the nervous system, some of them only in a subset of neurons (Sherman et al., 2005). One of them, abts-1, has been directly implicated in inhibitory neurotransmission (Bellemer et al., 2011).

2.9. New ion channels

New types of ion channels are still being discovered. Two entirely new non-selective cation channels of the plasma membrane, with a topology of more than 30 transmembrane domains, were identified in vertebrates in 2010 and named Piezo1 and Piezo2 (Coste et al., 2010). More recently, Piezo proteins were shown to be the pore-forming unit of a new type of mechanoreceptor (Coste et al., 2012). These channels bear no obvious homology to any other type of ion channel. C. elegans contains a single ortholog of the Piezo family (T20D3.11 fused to C10C5.1). Given the Piezo family precedent, it would not be surprising if more ion channels remain to be identified.

2.10. Summary of absent ion channels

In summary, genome sequence analysis shows that the following types of ion channels are notably absent in C. elegans: voltage-gated sodium channels, glycine-gated ion channels, P2X channels, and HCN channels. Note that the absence of voltage-gated sodium channels does not necessarily indicate an absence of classic action potentials in C. elegans, since all-or-none action potentials can, at least in C. elegans muscle, also be generated by voltage-gated calcium channels (Gao and Zhen, 2011; Liu et al., 2011). In most cases the absence of the channel is considered to be a loss since it is paralleled by the absence of the channel in some but not all invertebrates (P2X channels, HCN channels, voltage-gated sodium channels). In one case (glycine-gated LGIC) the channel may have only originated in the vertebrate lineage (Tsang et al., 2007).

3. Neurotransmitter pathways

The steps of synthesis, vesicular loading and reuptake of individual neurotransmitters are referred to as a “neurotransmitter pathway”. C. elegans is known to use acetylcholine, GABA, glutamate, serotonin, dopamine, octopamine and tyramine as neurotransmitters (and most likely more, such as melatonin), and the respective neurotransmitter pathways are shown in Figure 6A. The reader is referred to other chapters in WormBook that discuss these neurotransmitter systems in more detail (GABA; Biogenic amine neurotransmitters in C. elegans; Acetylcholine). I provide here an overview and summary of the genomic complement of confirmed and speculative neurotransmitter pathway genes (Table 14).


Figure 6: Neurotransmitter pathways. (A) Pathways for different neurotransmitters. Numbers in parentheses refer to the number of neurons of that neurotransmitter type in the adult hermaphrodite (Rand and Nonet, 1997). Neurotransmitter identity is inferred from the expression of genes shown in the schematic or by antibody staining (GABA, 5-HT). The identity of all glutamatergic neurons has not yet been determined. Note that after synaptic release, glutamate is taken up not only by presynaptic neurons but also by cells in immediate proximity of neurons, as assessed by the analysis of expression patterns of the glutamate reuptake transporters (Mano et al., 2007). The yellow circles indicate the substrates or cofactors required by individual enzymes (see panels B and C). The GABA degradation pathway generates succinic semialdehyde from GABA via GABA transaminase (gta-1, not yet studied in worms), which is then broken down to succinate by succinic semialdehyde dehydrogenase (SSADH, alh-7 in worms, also not yet studied). The site of action of the degradation pathway is likely within GABA neurons but also in other cell types, as the GABA reuptake transporter snf-11 is expressed not just in GABA neurons but also in other cell types. The two separate degradation pathways for monoamine transmitters, either via the monoamine oxidase MAO-A/B (amx-2 in worms, not yet studied) or the catechol-O-methyltransferase COMT (five comt genes in worms), are likely to act cell-autonomously since DA and 5HT reuptake is restricted to DA and 5HT neurons. (B) Acetylcholine biosynthesis. Acetyl-CoA and choline are the substrates for acetylcholine synthesis. Acetyl-CoA is synthesized by an enzyme, ACLY (acly-1, 2 in worms, both uncharacterized), which is enriched in cholinergic neurons in adult vertebrates. (C) Monoamine synthesis. Pathway for de novo synthesis or recycling of BH4, which is an essential cofactor of TH, TPH and TBH (panel A). BH4 is also used as a cofactor for TBH in the octopaminergic pathway as inferred by cat-4 expression in TBH(+) octopaminergic neurons (unpubl. data). Enzyme names are in red, C. elegans gene names in blue. The SPR (sepiapterin reductase) enzyme belongs to the large superfamily of related short-chain dehydrogenases and reductases (SDRs), and a specific sepiapterin reductase subtype is too difficult to identify within this family in C. elegans. All other enzymes in the BH4 pathways have clear single orthologs in the worm genome. GTPCH = GTP cyclohydrolase; GFRP = GTP cyclohydrolase feedback regulator protein (a regulatory factor used in serotonergic, but not dopaminergic neurons in vertebrates); PTPS = 6-pyruvoyl-tetrahydropterin synthase; SR = sepiapterin reductase; PCD = pterin-4-alpha-carbinolamine dehydratase; DHPR = dihydropteridine reductase. The BH4 synthetic intermediates are as follows: H2NTP = 7,8-dihydroneopterin triphosphate; PTP = 6-pyruvoyl-5,6,7,8-tetrahydropterin; BH4αC = tetrahydrobiopterin-4α-carbinolamine; qBH2 = quinoid dihydrobiopterin. The content of this panel is adapted from Deneris and Wyler (2012).

3.1. Neurotransmitter synthesis

Acetylcholine (ACh) is synthesized from choline by the enzyme choline acetyltransferase (cha-1, Figure 6A). In vertebrates, ATP citrate lyase, which generates the acetyl-CoA substrate for the acetyl transfer reaction (Figure 6B), is broadly expressed but becomes largely restricted to cholinergic neurons in the mature nervous system. Two ATP citrate lyase orthologs (acly-1, acly-2) are encoded in the worm genome, both uncharacterized; perhaps one of them has specialized for its role in cholinergic neurotransmission, while the other may perform a more general metabolic function. In vertebrates, choline is not generated de novo in neurons but synthesized in the liver by a specific biosynthetic pathway and then taken up by neurons through the choline transporter ChT. Curiously, C. elegans, like plants and fungi, has a distinct pathway for choline synthesis, the PEAMT pathway, which allows neurons to generate choline cell-autonomously from phospho-ethanolamine (thereby lessening the importance of the ChT homolog cho-1 in C. elegans) (Brendza et al., 2007; Mullen et al., 2007). The key enzymes in this alternative pathway are pmt-1 and pmt-2 (Brendza et al., 2007).

GABA is synthesized from the amino acid glutamic acid by the enzyme glutamic acid decarboxylase (GAD), encoded by unc-25 (Figure 6A). Vertebrates contain several GAD isozymes, but C. elegans only contains one.

Glutamate metabolism in the vertebrate nervous system is remarkably complex and not well explored in C. elegans. Vertebrate neurons do not express a pyruvate carboxylase for de novo synthesis of glutamate but are rather provided with glutamate from support cells (astrocytes), which convert glutamate to glutamine via glutamine synthetase and then provide glutamine to neurons. Neurons then synthesize glutamate from glutamine using glutaminase. In C. elegans, there is one pyruvate carboxylase (pyr-1), 4 glutamine synthetases (gln-1, gln-2, gln-3, gln-5—a nematode-specific expansion) and 3 glutaminases (glna-1, glna-2, glna-3—again a nematode-specific expansion). Whether these biosynthetic enzymes are also differentially expressed in C. elegans neurons and putative support cells is not known.

Most of the synthesis pathways of monoamine neurotransmitters are multistep processes (summarized in Figure 7). For dopamine biosynthesis, tyrosine is first hydroxylated by tyrosine hydroxylase (TH, one gene in C. elegans, cat-2) to produce L-Dopa; for 5-HT synthesis, tryptophan is hydroxylated by tryptophan hydroxylase (TPH, one gene in C. elegans, tph-1) to produce 5-hydroxytryptophan (Figure 6A, Figure 7). Both TH and TPH require a cofactor, tetrahydrobiopterin (BH4), which is generated through a multistep biosynthetic process (Figure 6C). Of the enzymes involved in this process (Figure 6C, Table 14), only the GTP cyclohydrolase encoded by the cat-4 locus has been analyzed to date in worms and is expressed exclusively in serotonergic and dopaminergic neurons, as expected. After their generation by TH and TPH, both L-Dopa and 5-hydroxytryptophan are then decarboxylated by the same aromatic amino acid decarboxylase (AAAD), encoded by bas-1, to produce dopamine and serotonin, respectively. Besides bas-1, there are four more genes in the genome that code for AAADs (Hare and Loer, 2004) (Table 14): one of them likely an inactive enzyme, two of them with unknown substrates and the last one being tyrosine decarboxylase (tdc-1), which is utilized to generate tyramine from tyrosine. In one of the two neuron classes that synthesize tyramine (i.e., express tdc-1), tyramine is converted by tyramine β-hydroxylase (tbh-1) to octopamine (Figure 6A, Figure 7).
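The enzymatic steps described above can be written down as a small lookup structure. The following Python sketch is only an illustration: the names STEPS, route_to and genes_for are hypothetical, and the table contains only the substrate, product, enzyme and gene assignments stated in the text (the BH4 cofactor pathway is left out).

```python
# Each step: (substrate, product, enzyme class, C. elegans gene), as named in the text.
# STEPS, route_to and genes_for are illustrative names, not part of any database or tool.
STEPS = [
    ("tyrosine", "L-Dopa", "TH", "cat-2"),
    ("L-Dopa", "dopamine", "AAAD", "bas-1"),
    ("tryptophan", "5-hydroxytryptophan", "TPH", "tph-1"),
    ("5-hydroxytryptophan", "serotonin", "AAAD", "bas-1"),
    ("tyrosine", "tyramine", "TDC", "tdc-1"),
    ("tyramine", "octopamine", "TBH", "tbh-1"),
]


def route_to(product):
    """Walk backwards from a product to its amino acid precursor."""
    # Every product in STEPS is made by exactly one step, so a dict suffices.
    by_product = {p: (s, enzyme, gene) for s, p, enzyme, gene in STEPS}
    route = []
    while product in by_product:
        substrate, enzyme, gene = by_product[product]
        route.append((substrate, product, enzyme, gene))
        product = substrate
    # Steps were collected product-first; reverse to get precursor-first order.
    return list(reversed(route))


def genes_for(product):
    """Return the genes required, in pathway order, to synthesize a product."""
    return [gene for _, _, _, gene in route_to(product)]


if __name__ == "__main__":
    for amine in ("dopamine", "serotonin", "tyramine", "octopamine"):
        print(amine, "<-", " + ".join(genes_for(amine)))
```

Written this way, the shared use of bas-1 by the dopamine and serotonin routes, and the dependence of octopamine synthesis on both tdc-1 and tbh-1, can be read directly from the table.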


Figure 7: Overview of the biosynthesis of biogenic amines. Shading indicates whether the respective amine has been biochemically detected in C. elegans (Pertel and Wilson, 1974; Sulston et al., 1975). Homologs for all biosynthetic enzymes exist in the C. elegans genome, but whether these homologs are utilized for the indicated biosynthetic steps is only known in the following cases: TH = tyrosine hydroxylase, encoded by cat-2; TDC = tyrosine decarboxylase, encoded by tdc-1; TBH = tyramine β-hydroxylase, encoded by tbh-1; TPH = tryptophan hydroxylase, encoded by tph-1; AAAD = aromatic amino acid decarboxylase, encoded by bas-1 for 5HT and dopamine biosynthesis; AANAT = aralkylamine N-acetyltransferase, encoded by anat-1; HIOMT = hydroxyindole-O-methyltransferase, encoded by homt-1. In other cases, indicated with “(?)”, the respective type of enzyme exists in the worm genome but its utilization is not clear: DBH (dopamine β-hydroxylase) function could be carried out by the very related TBH (tyramine β-hydroxylase, encoded by tbh-1), but tbh-1 is not expressed in dopaminergic neurons and no norepinephrine (noradrenaline) or epinephrine (adrenaline) is readily detectable in C. elegans (Sulston et al., 1975). Aside from the characterized AAADs, bas-1 and tdc-1, there are several uncharacterized AAADs in the genome (hdl-1, hdl-2, basl-1, see text) that could serve to generate the trace amines tryptamine, histamine or phenylethylamine. In regard to PNMT (phenylethanolamine N-methyltransferase), which generates epinephrine and synephrine in other species, there are three paralogous genes in the C. elegans genome (anmt-1 through anmt-3) which are equidistant to PNMT and its related enzymes indolethylamine N-methyltransferase (INMT) and nicotinamide N-methyltransferase (NNMT). INMT and NNMT are not involved in neurotransmitter metabolism but are generally thought to target xenobiotic compounds.
The tyrosine-derived neuromodulators dopamine, norepinephrine and epinephrine are also referred to as “catecholamines” (benzene with two hydroxyl groups and a side-chain amine). Tryptophan-derived neuromodulators are also sometimes referred to as “indolamines” (benzene with nitrogen-containing pyrrole ring). In vertebrates, tyramine, octopamine, synephrine, tryptamine, histamine and phenylethylamine are generally considered trace amines.

Melatonin is a biogenic amine that can act as a neuromodulator in various species (Hardeland and Poeggeler, 2003). Melatonin has been detected in worms and is involved in regulating locomotory behavior (Tanaka et al., 2007). Melatonin is synthesized through the N-acetylation of serotonin by serotonin N-acetyltransferase (called AANAT for arylalkylamine N-acetyltransferase) (Figure 7). There are many N-acetyltransferases encoded in the worm genome and one of them, anat-1, is most closely related to AANAT (Migliori et al., 2011). anat-1 is expressed in several uncharacterized neuron types and is functionally also uncharacterized. N-acetylated serotonin is then converted into melatonin by hydroxyindole-O-methyltransferase (HIOMT, homt-1 in C. elegans), which is also neuronally expressed, but functionally uncharacterized (Tanaka et al., 2007).

3.2. Vesicular transport of neurotransmitters

Vesicular transporters for small-molecule neurotransmitters fall into the phylogenetically conserved SLC superfamily of solute carriers. The SLC18 family is called the “vesicular amine transporter family” (He et al., 2009) and contains the vesicular transporter for biogenic amines (dopamine, serotonin, tyramine, octopamine), encoded by cat-1, and the acetylcholine vesicular transporter, encoded by unc-17 (Table 5).

The SLC32 family (“Vesicular inhibitory amino acid transporter family”) contains as its sole C. elegans member the vesicular GABA transporter, encoded by unc-47. To be localized appropriately within GABA neurons, the UNC-47 protein requires the phylogenetically conserved LAMP (lysosome associated membrane proteins)-like protein UNC-46, which is exclusively expressed in GABA neurons (Schuske et al., 2007). Vertebrate UNC-46 homologs are also expressed in a neuron-type specific manner (David et al., 2007). UNC-46 is distantly related to the more canonical C. elegans LAMP proteins lmp-1 and lmp-2. There are no other obvious paralogs of UNC-46.

The SLC17 family of transporters contains several bona fide neurotransmitter transporters (Reimer and Edwards, 2004) and has significantly expanded in worms. The SLC17 family is subdivided into several subfamilies (Table 5). The vertebrate SLC17A6-8 subfamily is composed of vesicular glutamate transporters (VGluTs). C. elegans has three members of this subfamily: the well characterized eat-4 gene and two closely related, likely VGluTs, called vglu-2 and vglu-3. Mammalian genomes also encode three VGluTs, but the C. elegans genes represent an independent expansion. The vertebrate SLC17A1-5 subfamily contains vesicular aspartate/glutamate transporters and C. elegans contains one uncharacterized homolog of this subfamily, C38C10.2 (Table 5). The SLC17A9 subfamily contains vesicular nucleotide transporters and C. elegans contains one homolog of this subfamily, vnut-1, which is also uncharacterized. Notably, however, there are no obvious homologs of ionotropic (P2X) or metabotropic (P2Y) purinergic neurotransmitter receptors and as such the substrate for vnut-1 is not clear. In addition, C. elegans contains nine genes (eight of which constitute a C. elegans-specific expansion) that are clear SLC17 superfamily members, but show no homology to any specific SLC17 subfamily (Table 5).

Casting the web even wider and considering more divergent SLC17 family members, the expansion of the C. elegans family is even more obvious—there are at least 51 SLC17 members in worms and only nine in humans (Hoglund et al., 2011). Most of this expansion appears to be nematode-specific (see the 43 genes in TreeFam tree TF315412). The role of these additional members in the nervous system, if any, is unknown.

Recent work demonstrated the localization of the concentrative nucleoside transporter CNT2 (an SLC28 family member) on synaptic vesicle membranes in rat (Melani et al., 2012). It is therefore possible that the two worm CNT2 homologs slc-28.1 and slc-28.2 may be involved in vesicular uptake of adenosine (discussed more in Section 3.5).

Another substance present in neurotransmitter-containing vesicles is zinc. Zinc is used in a variety of distinct processes in many cell types but particularly notable is its presence in many glutamatergic vesicles in the mammalian nervous system (Bitanihirwe and Cunningham, 2009). Neurons with zinc-containing glutamatergic synaptic vesicles have been termed “gluzinergic” (Bitanihirwe and Cunningham, 2009). Synaptic zinc modulates the overall excitability of the brain through effects on voltage-gated calcium channels, glutamatergic, GABAergic, dopaminergic and nicotinic receptors (Bitanihirwe and Cunningham, 2009). Zinc is transported into synaptic vesicles through members of the SLC30 family of SLC carriers (10 in humans) (Lichten and Cousins, 2009). There are 12 SLC30 members encoded in the worm genome, six show sequence affinity with specific human SLC30 subtypes and six of them are diverse (Table 5). At least one of them is expressed in a subset of neurons (toc-1). There are two worm orthologs (cdf-2, ttm-1) of the human SLC30A2/3/4/8 subfamily which contains SLC30A3/ZnT3, the best characterized zinc synaptic vesicular transporter.

3.3. Neurotransmitter reuptake

Members of the SLC transporter superfamily mediate the reuptake of a neurotransmitter once it has been released at the synapse (He et al., 2009). Depending on the neurotransmitter system, reuptake of the neurotransmitter (or a break-down product such as choline) occurs exclusively by the presynaptic cell, by adjacent cells, or by a combination of both.

The C. elegans genome contains homologs for all canonical reuptake transporters, based on both sequence and functional analysis (Table 11): one transporter for serotonin (mod-5, SLC6 family); one for dopamine (dat-1, SLC6 family); one for GABA (snf-11, SLC6 family); one for choline, the breakdown product of acetylcholine (cho-1, SLC5 family); and six glutamate plasma membrane transporters (glt genes, SLC1 family). The glutamate transporters are expressed in multiple distinct cell types (similar to the situation in vertebrates where glutamate is taken up not by the neuron but by surrounding tissue) (Mano et al., 2007), while the other transporters are mostly restricted in their expression to the neurons that have produced the transmitter (with the exception of the snf-11 GABA transporter which may be expressed only in a subset of GABAergic neurons) (Mullen et al., 2006).
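The transporter assignments in the preceding paragraph can be tabulated and grouped by SLC family. The sketch below is illustrative only: REUPTAKE and genes_per_family are hypothetical names, and the rows contain nothing beyond the genes, families and gene counts given in the text.

```python
from collections import defaultdict

# (substrate, gene or gene group, SLC family, number of genes), as listed in the text.
# REUPTAKE and genes_per_family are illustrative names only.
REUPTAKE = [
    ("serotonin", "mod-5", "SLC6", 1),
    ("dopamine", "dat-1", "SLC6", 1),
    ("GABA", "snf-11", "SLC6", 1),
    ("choline", "cho-1", "SLC5", 1),
    ("glutamate", "glt genes", "SLC1", 6),
]


def genes_per_family(rows):
    """Sum the number of canonical reuptake transporter genes per SLC family."""
    totals = defaultdict(int)
    for _substrate, _gene, family, count in rows:
        totals[family] += count
    return dict(totals)


if __name__ == "__main__":
    for family, count in sorted(genes_per_family(REUPTAKE).items()):
        print(family, count)
```

The grouping makes the asymmetry explicit: a single gene per substrate for the SLC5 and SLC6 transporters, but six genes for the SLC1 glutamate transporters.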

However, this is not the full story. The SLC6 family, which is called “Na+/Cl- dependent neurotransmitter transporter family”, contains the serotonin, dopamine and GABA reuptake transporters as well as 14 additional members, three of them with low sequence similarity and perhaps not acting as transporters (Table 5). One of them, snf-6, codes for a muscle-expressed acetylcholine/choline transporter (Kim et al., 2004) and another, snf-12, is expressed in the hypodermis and involved in an immunity response (its cargo is unknown) (Dierking et al., 2011). The remaining genes are completely uncharacterized, but two of them are expressed in neurons and six of them represent a nematode-specific gene expansion (Table 5). Any of these genes may encode reuptake transporters for known neurotransmitter systems for which no plasma membrane reuptake transporter has yet been identified (e.g., biogenic amines like tyramine, octopamine or trace amines), or for uncharacterized neurotransmitters.

Aside from the Na+/Cl- dependent neurotransmitter reuptake of monoamines by the SLC6 family, alternative monoamine uptake mechanisms are known to exist. The recently identified human plasma membrane monoamine transporter PMAT is a low-affinity, high capacity and Na+/Cl- independent monoamine transporter (Engel et al., 2004) and is a member of the small family of SLC29 transporters (four mammalian members). Another SLC29 family member is also selectively expressed in the mammalian brain (Dahlin et al., 2009) and a knockout of a fly SLC29 family member shows various neurophysiological defects (Knight et al., 2010). There are seven members of the SLC29 family in the C. elegans genome (ent-1 through ent-7) (Table 5). ent-1 and ent-2 are expressed in non-neuronal cells and the remaining genes have not yet been characterized (Table 5).

3.4. Neurotransmitter degradation

Either before or after reuptake several neurotransmitters are degraded (schematically shown in Figure 6A). Acetylcholine is already broken down in the synaptic cleft by acetylcholinesterases (AChE, ace genes) before the breakdown product, choline, is taken up by the presynaptic, cholinergic neuron. While mammals only contain a single AChE gene, C. elegans contains four (Table 14). The four ace genes are expressed in cholinergic neurons as well as other cell types.

Monoamine neurotransmitters such as dopamine and serotonin are removed from the synapse by specific plasma membrane reuptake transporters, as mentioned above. In mammals, two monoamine oxidases (MAO-A and MAO-B) are important for subsequent degradation of monoamine neurotransmitters. The C. elegans genome encodes several proteins with homologies to MAO, with AMX-2 being the most similar to mammalian MAO-A and MAO-B (Table 14). In an alternative degradation pathway the catechol-O-methyltransferase COMT degrades monoamines. In contrast to the single enzyme in humans, there are 5 COMT-like proteins encoded in the C. elegans genome, all uncharacterized (Table 14). None of these genes have been functionally analyzed to date.

In insects MAO activity is weak and instead the major enzyme for monoamine breakdown is serotonin N-acetyltransferase AANAT (anat-1 in worms) (Tsugehara et al., 2007). Whether anat-1, which is involved in melatonin synthesis in worms (Migliori et al., 2011), is also employed for serotonin degradation is not known. anat-1 is expressed in multiple unidentified neurons.

The GABA degradation pathway consists of the enzymes GABA transaminase (GABA-T in vertebrates, gta-1 in worms) and succinic semialdehyde dehydrogenase (SSADH, alh-7 in worms) (Table 14). The genes encoding these enzymes have not been functionally analyzed to date in worms.

3.5. The case for and against other neurotransmitter systems

There are several neurotransmitters in other organisms whose existence in C. elegans is unclear, such as glycine, purines (mainly ATP and ADP), histamine and trace amines. Since electrophysiological methods for measuring neurotransmitter-induced or -modulated currents are not readily applicable to C. elegans neurons, the only readily available route to identify neurotransmitter systems is to seek homologs of neurotransmitter pathway genes in the genome.

3.5.1. Glycine

Based on sequence homology, the C. elegans genome does contain a vesicular transporter for glycine (unc-47), which also transports GABA. unc-47 is, however, only expressed in those cells that by immunostaining also contain GABA. At most, glycine can therefore only be a co-transmitter with GABA. C. elegans contains no clear ortholog of the glycine reuptake transporter GlyT (SLC6A5), while it does contain a GABA transporter ortholog (snf-11). Like other invertebrates, the C. elegans genome also contains no obvious orthologs of glycine-gated ion channels.

3.5.2. ATP

The C. elegans genome contains an as-yet-uncharacterized gene with similarity to the vesicular transporter for nucleotides (vnut-1) (Sreedharan et al., 2010). After synaptic release, ATP is thought to be hydrolyzed to ADP by ecto-ATPases, of which there are three in the worm genome (mig-23, ntp-1, uda-1), and reuptake may occur via concentrative nucleoside transporters of the SLC28 family (CNT1, 2 and 3 in mammals), of which there are two in the worm genome (slc-28.1 and slc-28.2), both uncharacterized (Table 5). Uptake could also occur via the alternative, equilibrative nucleoside transporter family SLC29 (ENT1, 2, 3 and 4 in mammals; seven worm homologs, ent-1 through ent-7) (Table 5). However, both the ATPases and the nucleoside transporters are generally thought to have broad physiological roles, and their existence in the worm genome can therefore not be taken as strong evidence for the use of ATP as neurotransmitter. Lastly, and perhaps most indicative, no obvious orthologs of ionotropic or metabotropic purine receptors (P2X and P2Y) exist in the C. elegans genome.

3.5.3. Adenosine

Adenosine is not traditionally considered a neurotransmitter, but it has been shown to be involved in modulating neuronal activity through P1-type G-protein coupled receptors (Webster, 2001). Recent work demonstrated excitation-dependent release of adenosine from vertebrate neurons, supporting its role as neurotransmitter (Melani et al., 2012). There is a clear worm ortholog of the adenosine receptors, ador-1, which is equally related to A1, A2 and A3-subtype P1 adenosine receptors. No expression or functional analysis has been reported yet. As noted above, there are two worm homologs of the SLC28 transporters and seven SLC29 transporters, both of which transport nucleosides like adenosine across membranes. Intriguingly, vertebrate CNT2 (one of the SLC28 family members) has recently been localized on synaptic vesicles in the rat brain (Melani et al., 2012).

3.5.4. Other biogenic amines

Epinephrine and Norepinephrine. Epinephrine (adrenaline) and norepinephrine (noradrenaline) are generally thought to be restricted to deuterostomes and indeed they cannot be detected biochemically in C. elegans (Sulston et al., 1975). The biogenic amines octopamine and tyramine, which exist as trace amines in vertebrates, are generally thought to be the invertebrate “analogs” of norepinephrine and epinephrine (Roeder, 1999; Roeder, 2005). This is because of their related structures (Figure 7), but also because of striking similarities in their principal physiological roles (as discussed in detail for octopamine and norepinephrine in Roeder (1999)). However, on an enzymatic pathway level, the absence of epinephrine and norepinephrine cannot be predicted. The enzyme tyramine-β-hydroxylase (TBH-1), which generates octopamine in C. elegans, is closely related to dopamine-β-hydroxylase, which generates norepinephrine (Figure 7). There are three paralogous genes in the C. elegans genome (anmt-1 through anmt-3) (Table 14) that display similarity to phenylethanolamine N-methyltransferase (PNMT), which generates epinephrine. These genes are, however, equally related to indolethylamine N-methyltransferase (INMT) and nicotinamide N-methyltransferase (NNMT), which modify xenobiotic compounds.

Histamine. Histamine is biochemically detectable in worm extracts (Pertel and Wilson, 1974) but no specific role has yet been ascribed to histamine. Biosynthetically, histamine generation requires a histidine decarboxylase (HDC), a member of the family of aromatic amino acid decarboxylases (AAADs) (Figure 7). There is no obvious worm ortholog of vertebrate or fly HDC, but the worm genome contains five AAADs (Hare and Loer, 2004) (Table 14). One is the tyramine-producing enzyme TDC-1, another is the dopamine- and serotonin-producing enzyme BAS-1, both mentioned above. There is a close bas-1 paralog that lacks residues required for AAAD function (basl-1). Another uncharacterized AAAD (hdl-1) displays no specific affinity to any subtype and the last one (hdl-2) is somewhat more distantly related to the other AAADs. Vesicular transport of histamine can occur via the biogenic amine vesicular transporter cat-1 (Duerr et al., 1999). However, there are no obvious homologs of metabotropic histamine receptors in the genome of worms (or flies) (Roeder, 2003) and there are also no obvious orthologs to histamine-gated ion channels. On the other hand, GABAA channels have recently been shown to be modulated by histamine in vertebrates (Saras et al., 2008) and there are numerous GABAA-like channels in the worm genome. Arguing against the presence of histamine as a neurotransmitter in C. elegans is the lack of the two major enzymes involved in histamine breakdown, histaminase (diamine oxidase) and histamine methyltransferase. There are also no worm homologs of the Drosophila tan and ebony genes which generate histamine via an alternative pathway.

Melatonin. Even though melatonin is produced in C. elegans (Migliori et al., 2011) (Figure 7), its role as a neuromodulator in C. elegans is not proven. Arguing for a neuronal role is that the knockout of the HOMT enzyme produces locomotory defects (Tanaka et al., 2007); arguing against it is the absence of obvious homologs of the GPCR-type melatonin receptors MT1 or MT2 in the C. elegans genome (Tanaka et al., 2007). The best BLAST hits against vertebrate MT1/2 receptors are several npr-type putative neuropeptide receptors. Similarities between MT1/2 receptors and NPY receptors have been noted (Metpally and Sowdhamini, 2005). Melatonin can also signal via nuclear hormone receptors in mammals (Becker-Andre et al., 1994).

Other trace amines. As mentioned above, tyramine is generated by decarboxylation of the aromatic amino acid tyrosine by the AAAD enzyme TDC-1. In vertebrates, the decarboxylation of other aromatic amino acids (phenylalanine and tryptophan) by AAAD generates phenylethylamine and tryptamine, respectively (Figure 7). Both are called “trace amines” because they are found in only very small amounts in the vertebrate CNS, but they do have effects when administered to the brain (Webster, 2001). Since bas-1 and tdc-1 expression has already been assigned to specific aminergic neurons (i.e., serotonergic, dopaminergic, tyraminergic and octopaminergic), perhaps the above-mentioned AAAD hdl-1 and/or hdl-2 are involved in creating these trace amines. cat-1 could be involved in synaptic vesicle uptake and the above-mentioned orphan SLC6 family members in synaptic reuptake. Since trace amines are thought to be signaling through GPCRs, some of the many orphan GPCRs in the worm genome (some of which have homology to trace amine receptors, e.g., srsx-22 or srsx-25, BLAST score ~1e-09) could serve as receptors.

Synephrine, a metabolite of octopamine, is another trace amine that may exist in C. elegans. For this to be the case, one of the three above-mentioned possible homologs of PNMT (anmt-1 through anmt-3) would need to be able to use octopamine as substrate.

3.5.5. Conclusion

Taken together, the points raised above do not make definitive arguments for or against the existence of the additional neurotransmitter/neuromodulator systems in C. elegans. The best candidates for as-yet-unexplored neuromodulator systems are melatonin and adenosine. The substantial number of uncharacterized orphan vesicular transporters and reuptake transporters, as well as the impressive number of orphan ligand-gated channels (lgc genes mentioned above) and GPCRs, strongly suggest that as-yet-uncharacterized neurotransmitter/neuromodulator systems exist in the worm.

4. Neuropeptides

4.1. Neuropeptide-encoding genes

At present 122 neuropeptide genes encoding over 250 distinct neuropeptides have been identified in the C. elegans genome (Li and Kim, 2010). Of these, 40 genes encode insulin-like peptides (ins genes), 31 genes encode FMRFamide-related peptides (flp genes), and 51 genes encode non-insulin, non-FMRFamide-related neuropeptides (most encoded by the so-called nlp genes) (Li and Kim, 2010) (Table 15). Some of the nlp genes show similarities to neuropeptides of other species (Nathoo et al., 2001). There are likely more neuropeptides awaiting discovery, as evidenced by the slow trickle of newly discovered neuropeptides that initial analyses did not uncover. For example, recent proteomic analysis has identified a PDF peptide homolog (Janssen et al., 2009), genetic screens for behavioral mutants identified a previously uncharacterized neuropeptide, snet-1 (Yamada et al., 2010), and recent searches for C. elegans homologs of neuropeptides for which receptors are predicted in the worm genome (see Section 5.2 below) revealed a putative oxytocin homolog, ntc-1 (Beets et al., 2012; Garrison et al., 2012).
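As a quick consistency check on the counts given above, the three gene classes can be tallied against the stated total. This is a minimal sketch; the dictionary keys and the helper class_shares are illustrative names, and the numbers are those quoted from Li and Kim (2010) in the text.

```python
# Gene counts per neuropeptide class, as quoted in the text (Li and Kim, 2010).
# The key labels and class_shares are illustrative names only.
NEUROPEPTIDE_GENES = {
    "ins (insulin-like)": 40,
    "flp (FMRFamide-related)": 31,
    "other (mostly nlp)": 51,
}

TOTAL_GENES = sum(NEUROPEPTIDE_GENES.values())


def class_shares(counts):
    """Return each class's share of the total gene count, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    print("total genes:", TOTAL_GENES)
    for name, share in class_shares(NEUROPEPTIDE_GENES).items():
        print(f"{name}: {share:.1f}%")
```

The three classes sum to the 122 genes stated in the text; with over 250 distinct peptides, this corresponds to an average of roughly two peptides per gene.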

Expression patterns have been analyzed for more than 60 of the neuropeptide-encoding genes using reporter gene fusions. As summarized by Li and Kim, these studies show that all ins and flp genes examined and most of the nlp genes are expressed in the nervous system, with each gene having a unique expression pattern (Li and Kim, 2010) (Table 15). So far, more than half the cells in the nervous system express at least one neuropeptide gene but this number may increase because genes involved in neuropeptide release (e.g., unc-31) appear to be very broadly if not ubiquitously expressed in the nervous system. Furthermore, many individual neurons clearly express multiple neuropeptide genes.

4.2. Biosynthesis and processing of neuropeptides

Neuropeptides are produced as proproteins that contain several individual neuropeptides. These precursors are first cleaved by proprotein convertases, then the C-terminal basic residue is cleaved by a carboxypeptidase (Li and Kim, 2010). Afterwards most neuropeptides become further modified through amidation. There are four Kex2/subtilisin-like proprotein convertases (PCs) in the C. elegans genome (Table 16), one uncharacterized ortholog of a chaperone for the protease (sbt-1) and one carboxypeptidase E (egl-21) which has been explicitly linked to neuropeptidergic signaling (Li and Kim, 2010). In addition, the genome contains two as-yet-uncharacterized homologs of the carboxypeptidase D family (cpd-1 and cpd-2), which are functionally related to the E family (Dong et al., 1999). These genes are the only representatives of the neuropeptide-processing carboxypeptidases in C. elegans, formerly called “regulatory carboxypeptidases” and now called the M14B subfamily of metallocarboxypeptidases (TF315592) (see Merops database). C. elegans also contains a number of additional M14A-type carboxypeptidases (Merops database) but, in light of the function of their orthologs, they all likely act in digestion.

While the proprotein convertase egl-3 and the carboxypeptidase egl-21 are well characterized in terms of function and expression (Kass et al., 2001; Jacob and Kaplan, 2003), much less is known about the amidation process. Studies in other organisms have shown that amidation involves the modification of a peptide-glycine processing intermediate by two enzymatic activities: a mono-oxygenase (called PHM) that associates with copper and molecular oxygen to hydroxylate glycine, and an enzyme (PAL) which cleaves the hydroxyglycine moiety to produce the peptide-amide (Prigge et al., 2000). In humans, both enzymatic activities reside in a single protein called PAM. In flies these two activities have split into two proteins (PHM and PAL). Curiously, the C. elegans genome contains genes for both versions: there is an as-yet-uncharacterized homolog of the multi-enzyme PAM protein encoded by pamn-1, as well as PHM and PAL orthologs, encoded by pghm-1 and pgal-1, respectively (Table 16). pamn-1 is expressed in the nervous system; the other genes have not been studied yet.

Analogous to classic neurotransmitter removal by reuptake or degradation, the activity of neuropeptides is also terminated through specific mechanisms. No reuptake mechanisms have yet been discovered for neuropeptides; rather, signal termination works through the degradation of the neuropeptide by specific proteases. A number of different protease families have been implicated in this signal termination process, with the neprilysins being the best studied (Turner et al., 2001). Neprilysins are zinc metallopeptidases present on the outer surface of cells. The neprilysin family has significantly expanded in C. elegans, containing 27 members (Table 16), while there are only five neprilysin-like proteases in mammals. Two of the nep genes have been implicated in the execution of several distinct C. elegans behaviors (Spanier et al., 2005; Yamada et al., 2010).

Other proteases implicated in neuropeptide degradation (Isaac et al., 2009) are class III and class IV dipeptidyl peptidases (of which there are eight encoded in the C. elegans genome), the angiotensin-converting enzyme family (of which there is only one C. elegans homolog encoded by acn-1; the protein product is, however, predicted to be catalytically inactive), and tripeptidyl peptidase II (one C. elegans gene, tpp-2) (Table 16).

4.3. Neuropeptide receptors

The most prevalent class of neuropeptide receptors are 7TM G-protein-coupled receptors (GPCRs). These types of receptors and their relationship to neuropeptide signaling will be discussed in the ensuing GPCR section (Section 5). However, GPCRs are not the only neuropeptide receptors, as exemplified by the neuromodulatory insulin peptides. The insulin-like peptide encoded by ins-1 acts in a neuromodulatory manner to affect the local activity in a chemosensory circuit (Tomioka et al., 2006). As mentioned above, there are 40 insulin-encoding genes, and all genes examined so far are expressed in restricted domains within the nervous system, suggesting neuronal functions for many other insulins as well. DAF-2 is the only insulin/IGF receptor-like tyrosine kinase in the C. elegans genome and several insulin-like peptides, such as ins-1, are known to signal through DAF-2 (Tomioka et al., 2006). Intriguingly, the C. elegans genome also contains an unusual type of insulin receptor-related protein (Dlakic, 2002), encoded by the irld genes (for insulin/EGF receptor L domain). Unlike DAF-2, these proteins do not contain tyrosine kinase domains but only contain extracellular Cys-rich L domains (related to LRR domains), found in insulin and EGF receptors (IPR000494). Like the insulin-encoding ins genes, genes for this type of putative insulin binding protein have vastly expanded in the worm genome (69 genes) (Table 17). Many of these genes are recent duplicates. One third of these proteins contain transmembrane domains or membrane anchors and may be part of atypical insulin receptor complexes. L domain-only proteins cannot be found in Drosophila or vertebrate genomes (Dlakic, 2002).

Other types of neuropeptide receptors exist. The currents of ENaC/DEG/ASIC channels from snails and vertebrates are modulated by FMRFamide and related neuropeptides (Askwith et al., 2000; Jeziorski et al., 2000). Whether some C. elegans ENaC/DEG/ASIC-channels are gated by FMRFamides is not known. In any case, it should be kept in mind that non-GPCRs may act as receptors for other neuropeptides as well.

5. G-protein coupled receptors (GPCRs)

C. elegans contains more than 1,300 predicted GPCRs (see The putative chemoreceptor families of C. elegans). The human genome is thought to code for around 800 GPCRs (Lagerstrom and Schioth, 2008). Detailed bioinformatic analysis has organized GPCRs into five classes (Lagerstrom and Schioth, 2008) (Table 18). The Rhodopsin class (formerly class A) is by far the largest and contains a diverse set of members, including aminergic, peptidergic and olfactory receptors. The Secretin class (formerly class B) is much smaller and contains various peptidergic receptors related to the founding member of the family, the secretin receptor. The Adhesion class (formerly also part of class B) contains GPCRs with largely expanded N-terminal regions that are involved in cell adhesion; they usually also contain a “GPS domain” (for “GPCR proteolytic site”) that is required for cleavage of the N-terminal extracellular domain. The Glutamate receptor class (formerly class C) contains metabotropic glutamate and GABA receptors. The Frizzled/Taste2 class contains Wnt receptors and a newly discovered class of taste receptors. As shown in Table 18, C. elegans contains members of each class. Rather than discussing them according to this classification system, I will discuss them by function and type of ligand.

5.1. Metabotropic neurotransmitter receptors

The C. elegans genome contains three GPCRs for ACh (gar-1, gar-2, gar-3), two for GABA (gbb-1, gbb-2), and three for glutamate (mgl-1, mgl-2, mgl-3). These receptors can be clearly identified based on sequence features (Table 19). The metabotropic glutamate receptors mgl-1, mgl-2 and mgl-3 fall into the three ancestral groups of mGluRs (Kuang et al., 2006). In addition, there are two more, as-yet-uncharacterized GPCRs in the genome, C30A5.10 and F35H10.10, that show significant similarities to mGluRs and display a “Ligand-binding domain of metabotropic glutamate receptor” as annotated by the CCD domain database. F35H10.10 contains signature motifs of the GPCR family 3 (also called family C, IPR000337).

The ACh GPCRs fall into the Rhodopsin class of GPCRs (like most other GPCRs), while the GABA and Glu GPCRs fall into the Glutamate receptor class (class C) of GPCRs (IPR000337). Aside from the above-mentioned genes, there are no other class C family members in the worm genome.

C. elegans contains 16 GPCRs for monoaminergic neurotransmitters (Table 19). These receptors fall into the Rhodopsin or “Class A” family of GPCRs. Four biogenic amines have been identified as neurotransmitters in C. elegans so far: serotonin, dopamine, tyramine and octopamine (as mentioned above, there may be more). Several but not all of these receptors can be assigned to specific ligands by sequence alone. Biochemical and functional analysis has assigned most of the receptors to specific ligands, as shown in Table 19. Reporter-based expression patterns exist for all 16 family members, revealing their expression in a restricted set of neurons. Several of the receptors are also expressed in muscle cells. In addition to the 16 genes encoding aminergic GPCRs, there is also another gene, C24A8.6, that likely arose through a local inversion of the neighboring dop-6 gene. Parts of this gene are identical to dop-6, but as the identity is limited to one restricted part of the protein, C24A8.6 is unlikely to form a functional GPCR. More information on biogenic amines and their receptors can be found in Biogenic amine neurotransmitters in C. elegans.

As mentioned above, even though it is not clear whether adenosine can generally be classified as a neurotransmitter in any system, a clear ortholog of a GPCR-type adenosine receptor, ador-1, is encoded in the worm genome but has not yet been characterized in terms of expression or function.

5.2. Neuropeptide receptors

Apart from some known exceptions described in Section 4.3, neuropeptides generally signal through GPCRs. Neuropeptide GPCRs are mostly of the Rhodopsin class, but some belong to the Secretin class (Table 18). Predicting neuropeptide receptors by sequence (i.e., distinguishing them from other types of GPCRs, such as olfactory GPCRs) is not as straightforward as it is for metabotropic receptors of classic neurotransmitters. Nevertheless, some basic sequence features are conserved enough in neuropeptide receptors to assess the number of neuropeptide receptors in the worm genome, as was done in several previous studies (Keating et al., 2003; Wenick and Hobert, 2004; Janssen et al., 2010). By revisiting these previous datasets and supplementing them with additional analysis, a list of 153 genes can be assembled whose products share significant homology to known neuropeptide receptors (Table 20, see legend for methodology). This list includes the 14 C. elegans neuropeptide receptors that have been biochemically deorphanized as FLP or NLP receptors (Li and Kim, 2010). The cutoff for significance was set to a BLAST e value of 1e-04 to exclude known non-neuropeptide receptor GPCRs (three deorphanized true sensory receptors, odr-10, srbc-64, and srg-37, and one likely sensory receptor, str-2, were used to set this threshold as they all have e values larger than 1e-04).
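
The e-value cutoff described above can be illustrated with a minimal sketch. It assumes that BLAST results are available in the standard 12-column tabular format (e value in the eleventh column); the query and subject names, the sample values, and the decision to treat a hit exactly at the cutoff as significant are hypothetical choices made for illustration and are not taken from the analysis behind Table 20.

```python
from typing import Iterable, Iterator, List, NamedTuple

# Cutoff quoted in the text; handling of a hit exactly at the cutoff is an assumption.
E_VALUE_CUTOFF = 1e-04


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column tabular BLAST output; the e value is column 11."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        yield Hit(fields[0], fields[1], float(fields[10]))


def candidate_queries(hits: Iterable[Hit], cutoff: float = E_VALUE_CUTOFF) -> List[str]:
    """Return queries whose best (lowest) e value is at or below the cutoff."""
    best = {}
    for hit in hits:
        if hit.query not in best or hit.evalue < best[hit.query]:
            best[hit.query] = hit.evalue
    return sorted(q for q, e in best.items() if e <= cutoff)


# Hypothetical hits: geneA has one strong hit, geneB only a weak one.
SAMPLE = [
    "geneA\tknownNPR1\t30.0\t300\t200\t5\t1\t300\t1\t300\t2e-20\t95.0",
    "geneA\tknownNPR2\t25.0\t280\t205\t6\t1\t280\t1\t280\t0.5\t30.0",
    "geneB\tknownNPR1\t22.0\t150\t115\t4\t1\t150\t1\t150\t0.003\t35.0",
]

if __name__ == "__main__":
    print(candidate_queries(parse_tabular(SAMPLE)))
```

In this sketch a query gene is retained if any of its hits to a known neuropeptide receptor passes the cutoff, so geneA is kept and geneB, whose only hit has an e value larger than 1e-04, is excluded in the same way as the sensory receptors used to set the threshold.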

About one quarter of the 153 putative neuropeptide receptors stand out in their obvious relation to vertebrate neuropeptide receptor families, such as the neuropeptide Y receptor family (at least 12 members in C. elegans), the neuromedin receptor family (at least 6 C. elegans genes), the neurokinin/neuropeptide FF/orexin receptor family (6 genes), the somatostatin family (6 genes), the galanin receptor family (4 genes), and the gonadotropin-releasing hormone receptor family (8 genes) (Cardoso et al., 2012) (Table 20). Most of the remaining C. elegans neuropeptide receptors cluster into related and likely paralogous families, some of them quite large. For example, one family contains 20 genes, termed frpr genes, and clusters with the Drosophila FMRFamide receptor FR (TF316702). Two of the family members have been shown biochemically to be FLP receptors (Li and Kim, 2010). However, in spite of its similarity with other members of this family, one family member (daf-37) was recently reported to be a sensory receptor for ascarosides, glycolipids that worms use to communicate with one another (Park et al., 2012). Another family with 17 genes, the dmsr family, clusters with Drosophila dromyosuppressin receptor Dms-R2 (TF315509), which serves as a receptor for a FMRFamide peptide (Klose et al., 2010). Indeed, one of the worm family members, EGL-6, has been shown to be a receptor for FLP-10 and FLP-17 (Ringstad and Horvitz, 2008). Members of another family (TF315321) show low similarity to the fly FMRFamide receptor, but the degree of similarity is consistent across most family members. Many of the remaining putative neuropeptide receptors also fall into smaller families, and two of them are known to be activated by NLP peptides (Table 20). Some of them are obvious sequence orthologs of fly or vertebrate receptors (Table 20). Three of them are secretin-type (formerly class B-type) neurotransmitter receptors, pdfr-1, seb-2 and seb-3 (for secretin/class B-type receptor). A recent analysis of GPCRs in the C. elegans genome comes to a similar conclusion (Frooninckx et al., 2012).

Almost all of the >20 candidate neuropeptide receptor-encoding genes for which an expression pattern has been examined show expression in a restricted number of neurons (Table 20). Given that many neuronally expressed peptides modulate transmission at neuromuscular junctions or contraction of other organ types (e.g., the gut (Nichols et al., 2002)) it is likely that many of the receptors will also be expressed in muscle or other non-neuronal cell types.

However, there are possibly many more than the 153 neuropeptide receptors described above. As Robertson and Thomas note in The putative chemoreceptor families of C. elegans, the divergent srw gene family, which consists of 100 members, is related to neuropeptide receptors. A BLAST analysis of representative genes from individual srw subfamily branches illustrates this notion (Table 21). Since almost 90% of srw genes reside in large clusters on the arms of chromosome V, they likely represent a nematode-specific expansion. Whether they serve to monitor internal signals or serve as receptors for environmental peptides, as suggested by Robertson and Thomas, remains to be seen. In addition, several members of the srsx family of GPCRs (37 genes) show significant BLAST hits to neuropeptide receptors. Any of those GPCRs could of course also be receptors for other chemical substances such as lipids or other small organic molecules. That this is more than a mere possibility is illustrated by daf-37 and daf-38, two GPCRs recently reported to be receptors for ascarosides (as mentioned above, these are glycolipids that worms use to communicate with one another) (Park et al., 2012). Both daf-37 and daf-38 display significant homology to neuropeptide receptors (Table 21), illustrating the diversity and unpredictability of ligands for GPCRs.

5.3. Sensory and orphan receptors

Robertson and Thomas identified ~1,280 likely chemoreceptor genes in the genome, which they classify into several groups in Table 1 of The putative chemoreceptor families of C. elegans. The evidence for the function of these GPCRs as chemoreceptors is very strong: First, for the many dozen genes for which expression profiles have been examined, expression is clearly observed in sensory neurons (Troemel et al., 1995; Colosimo et al., 2004; Chen et al., 2005). Second, several of the receptors are localized to the sensory dendritic endings of chemosensory neurons where the chemosensory apparatus is thought to be concentrated (Dwyer et al., 1998; Colosimo et al., 2004). Third, since the discovery of odr-10 as a receptor for the odorant diacetyl (Sengupta et al., 1996; Zhang et al., 1997), four additional genes (srbc-64, srbc-66, srg-36, srg-37) have been shown to code for receptors for ascarosides (Kim et al., 2009; McGrath et al., 2011). Fourth, as pointed out by Robertson and Thomas (The putative chemoreceptor families of C. elegans), it is hard to imagine what else other than sensory receptors could be encoded by such a large gene family.

The sensory modalities of the sensory GPCRs may be diverse. Apart from small chemicals (e.g., diacetyl for ODR-10), some of the GPCRs may also be light detectors (Edwards et al., 2008; Liu et al., 2010). Sensory modalities are difficult to predict; for example, the UV receptor lite-1 is a member of a small subfamily of worm GPCRs (five gur genes) that were initially recognized based on their similarity to Drosophila gustatory receptors (Robertson et al., 2003).

It is noteworthy that a substantial number of the chemosensory GPCR family members are not only expressed in sensory neurons, but also in distinct sets of interneurons and motorneurons (a few samples are shown in The putative chemoreceptor families of C. elegans). In fact, some are only expressed in interneurons, e.g., sra-11 (Troemel et al., 1995). Since more than one third of so-far-analyzed chemosensory GPCRs are expressed in non-sensory neurons (Troemel et al., 1995; Chen et al., 2005), it is to be expected that hundreds of the total of ~1,280 receptors may be expressed in many different parts of the nervous system. These genes may code for receptors that monitor neuronal or non-neuronally-derived internal signals. These signals could range in their composition from peptides to small organic substances to lipids, all of which are recognized by GPCRs in other systems. These signals may be highly species-specific. For example, there are no worm homologs for a good number of vertebrate GPCRs known to sense different types of lipids (Kostenis, 2004). Additionally, there are no homologs of cannabinoid GPCRs (key enzymes in endocannabinoid synthesis are also missing), no homologs of the LPA receptor, and no homologs of the FFA (free fatty acid receptor) GPCRs. Worms may use different types of lipids for signaling.

5.4. Adhesion GPCRs

Together with a small number of secretin-type hormone receptors (three in C. elegans), this family used to be part of the class B family of GPCRs, but is now recognized as its own family (Lagerstrom and Schioth, 2008) (Table 18). The signature features of adhesion GPCRs are (1) an extended N-terminus that contains distinct sets of domains involved in protein-protein interactions (hence their presumed role in cell adhesion), (2) a signature GPS domain involved in autoproteolytic cleavage of the N-terminal extracellular domain, and (3) the 7TM part which is related to the secretin/hormone-type GPCRs.

While humans contain a diverse set of proteins in the adhesion group, each with distinct extracellular domains (33 in total) (Lagerstrom and Schioth, 2008) and diverse functions (Yona et al., 2008), C. elegans only contains five members of this group (Table 22). Three of them are obvious orthologs of well-characterized fly and human proteins: fmi-1 (Flamingo ortholog), lat-1, and lat-2 (latrophilin orthologs). fmi-1 functions in neuronal development and synapse formation (Steimel et al., 2010; Najarro et al., 2012) and lat-1 has a role in synaptic transmission (Willson et al., 2004). C. elegans contains single homologs of proteins that latrophilins are thought to interact with, neurexin (nrx-1 in worms) and teneurin-1 (ten-1 in worms), both with presumed functions in synapse formation.

Methuselah is the founding member of a GPCR subfamily that expanded specifically in Drosophila and is considered a third subgroup of the former “class B” class of GPCRs (Harmar, 2001). Methuselah has documented functions in synaptic transmission (Song et al., 2002) and can bind various neuropeptides (Ja et al., 2009). Based on similarity identified in the Panther database (PTHR12011), there are two genes in C. elegans related to Methuselah, mth-1 and mth-2. In contrast to fly Methuselah proteins, the two worm proteins contain the signature GPS domain, which clearly places them in the adhesion group of class B GPCRs. Neither the function nor the expression of mth-1 and mth-2 has yet been examined.

5.5. Frizzled/Taste2 GPCRs

There are no Taste2 GPCRs in the C. elegans genome, but there are four Frizzled-type GPCRs (Table 18), which are receptors for Wingless-type signaling molecules (of which there are five in the C. elegans genome). In both C. elegans and other organisms, there is abundant evidence for the role of Frizzled signaling in neuronal development (for a review of the C. elegans work, see Wnt signaling). In flies and vertebrates, there is evidence for a role of Frizzled in synapse function, particularly in the form of anterograde and retrograde feedback signals at the synapse (Speese and Budnik, 2007). In worms, the evidence for adult neuronal functions of Frizzleds is the dependence of AMPA receptor localization on a Frizzled effector gene, β-catenin (Dreier et al., 2005), and the observation that reporters for the two Frizzled receptors mom-5 and cfz-2 are expressed in a subset of adult neurons. The expression or function of other Frizzleds in the adult nervous system has not yet been reported.

5.6. Downstream of GPCRs

GPCRs usually signal through heterotrimeric G-protein complexes by acting as nucleotide exchange factors (signaling downstream of Frizzled GPCRs tends to be more complex (Speese and Budnik, 2007)). There are 21 Gα-, two Gβ-, and two Gγ-encoding genes in the C. elegans genome (see Heterotrimeric G proteins in C. elegans), compared to 21, 5 and 19 genes in humans (Table 23). Three Gα-encoding genes (gsa-1, egl-30, goa-1), both Gβ-encoding genes and one Gγ-encoding gene (gpc-2) are very broadly expressed. Most of the remaining 18 Gα-encoding genes and the other Gγ-encoding gene (gpc-1) are expressed in a highly neuron-type specific manner, most of them in sensory neurons (see Heterotrimeric G proteins in C. elegans).

The activity of GPCRs is controlled by a number of regulatory factors, including G protein coupled receptor kinases (GRKs), of which there are two in the worm genome (grk-1 and grk-2) (Table 23), and arrestin. Arrestins regulate the inactivation, internalization and trafficking of GPCRs (Gurevich and Gurevich, 2006). Mammalian genomes encode four conventional arrestins, which are either specifically expressed in visual systems (vertebrate cone arrestin and rod arrestin) or are broadly expressed (arrestins 2 and 3). There is one conventional arrestin in the C. elegans genome, called arr-1 (Table 23).

Intriguingly, a functionally uncharacterized family of arrestin-related proteins (now called α-arrestins) has been recognized in genomic sequences across phylogeny (Alvarez, 2008). Their overall sequence homology to classic arrestin is relatively low, but their predicted overall structure appears highly related. Classic arrestin is composed of two modules with antiparallel β-sheets (Arrestin-N and Arrestin-C domains), which are similar to an Fn3 module (Aubry et al., 2009). The arrestin-related proteins (called ADCs or ARRDCs for “Arrestin domain containing”) share this structure (Aubry et al., 2009). There are five of these ARRDC proteins in humans, but C. elegans has vastly expanded its repertoire of this type of protein: it contains 31 arrd genes coding for ARRDC proteins. As listed in Table 23, expression of the three arrd genes whose expression has been analyzed to date was detected in subsets of neurons. It is intriguing to think about this family expansion in the context of expansion of the worm repertoire of GPCRs, as well as G proteins (above). The limited number of neurons in the worm means that individual neurons likely express dozens of GPCRs. To distinguish between activation of different GPCRs (and therefore to permit discrimination between distinct inputs), GPCR subfamilies may be able to couple to distinct downstream signaling molecules to produce distinct signaling outputs.

GPCR function is also regulated by recently discovered transmembrane proteins of the RAMP family and by the GASP proteins (Magalhaes et al., 2012); there are no obvious homologs of either type of protein encoded in the worm genome.

Heterotrimeric G-protein signaling is controlled by various regulatory factors, including guanine nucleotide exchange factors (GEFs) and guanine nucleotide dissociation inhibitors (GDIs). Apart from the GPCRs themselves, there is at least one other dedicated GEF encoded in the worm genome, ric-8, and there are three GDI proteins, each of which contains “GoLoco” domains (Table 23). A large family of G-protein regulators is encoded by the rgs genes (“regulator of G protein signaling”) (Porter and Koelle, 2009). There are 21 members of this family encoded in the worm genome (compared to 38 in humans), many with either broad or cell-type specific expression in the nervous system and specific roles in nervous system function (Porter and Koelle, 2009) (Table 23).

6. Cyclic GMP

6.1. Guanylyl cyclases

Guanylyl cyclases generate a second messenger molecule, cGMP, which is predominantly used in the nervous system of C. elegans, as assessed by the neuron-type specific expression of most guanylyl cyclases (Table 24). Generally, guanylyl cyclases exist either as soluble, cytoplasmic versions (sGCs) or as single-pass transmembrane versions with large extracellular domains (rGCs) (Potter, 2011). There are 27 gcy genes in the C. elegans genome coding for rGCs and seven gcy genes coding for sGCs (Table 24), compared to five and four genes in humans, respectively (Fitzpatrick et al., 2006; Ortiz et al., 2006).

Members of both families are thought to be receptors for small ligands in C. elegans. Molecular oxygen is the ligand for sGCs, with different sGCs being tuned to detect different oxygen concentrations (Gray et al., 2004; Zimmer et al., 2009). sGCs are exclusively expressed in sensory neurons (all likely oxygen-sensory neurons).

rGCs are expressed in many sensory neurons (where several of them localize to sensory dendrites), but are also expressed in non-sensory neurons. Specific functions have been identified in non-sensory neurons (Shinkai et al., 2011), suggesting that rGC function is not restricted to sensing environmental cues. No specific ligands have yet been identified for any C. elegans rGC proteins, but there are many diverse candidates: (1) small peptides, in analogy to the ligands of mammalian GCY proteins (Potter, 2011); (2) salt ions, based on their role in sensory taste perception (Ortiz et al., 2009); and (3) other small organic molecules (based on the presence of the “extracellular ligand-binding receptor domain”, IPR001828, which can also be found in metazoan glutamate receptors and bacterial amino acid transporters).

rGCs may also be activated by CO2 (Hallem et al., 2011; Brandt et al., 2012). In contrast to the other putative ligands of rGCs, which likely act through the extracellular domain, CO2 sensing may work through the intracellular catalytic domain, in analogy to vertebrate rGCs (Potter, 2011). This process may entail the conversion of CO2 into bicarbonate, catalyzed by carbonic anhydrases (see below, Section 7).

GCAPs (guanylate cyclase-activating proteins) and GCIPs (guanylate cyclase-inhibitory proteins) are calcium-binding, EF hand proteins that control GCY activity in the phototransduction pathway in the vertebrate retina (Palczewski et al., 2004). The closest relatives to the GCAP and GCIP proteins in the C. elegans genome are the NCS proteins NCS-1, NCS-2 and NCS-3 (Table 6). The expression of ncs-1 in many sensory neurons matches the expression of receptor-type gcy genes.

Aside from cGMP-gated ion channels (CNG) discussed above, other critical neuronal effectors of cGMP are cGMP-dependent protein kinases, of which there are two encoded in the worm genome: egl-4, which is expressed in the nervous system and has various nervous-system-associated functions (see Chemosensation in C. elegans); and pkg-2, which is presently uncharacterized.

6.2. Phosphodiesterases

cGMP levels are controlled by phosphodiesterases. The C. elegans genome encodes six of these enzymes (pde-1 through pde-6), each of which has a closely-related human ortholog (Conti and Beavo, 2007) (Table 24). Based on this orthology, pde-4 and pde-6 are cAMP-specific while pde-1, pde-2, pde-3 and pde-5 control cGMP levels. pde-1, pde-2 and pde-3 are expressed in the nervous system; the expression of other genes has not been investigated. Six pde genes do not reflect a nematode expansion as there are more than 20 human pde genes; the clear one-to-one orthology observed for human and worm PDEs (Conti and Beavo, 2007) is rarely seen in other gene families analyzed in this paper.

7. Receptors for CO2 and O2

Like vertebrate rGCs, C. elegans rGCs are activated by CO2 (Hallem et al., 2011; Brandt et al., 2012). This process may entail the conversion of CO2 into bicarbonate, catalyzed by carbonic anhydrases. Expression of carbonic anhydrase is generally considered to be a hallmark of CO2 responsive neurons (Bretscher et al., 2011). The C. elegans genome encodes eight predicted carbonic anhydrases, six of the α family (cah-1 through cah-6) and two of the mitochondrial β family (bca-1 and bca-2) (Table 25). All six cah genes are expressed in restricted patterns in the nervous system (Bretscher et al., 2011).

The above-mentioned sGC proteins are not the only oxygen sensors. Globin domain proteins are heme proteins important for oxygen transport, storage and sensing (Weber and Vinogradov, 2001). The globin glb-5 has been implicated specifically in neuronal oxygen sensing (McGrath et al., 2009; Persson et al., 2009). With 33 members, the globin family has dramatically expanded in nematodes (Hoogewijs et al., 2008; Tilleman et al., 2011). All but three of the genes are expressed in restricted patterns in the nervous system (Hoogewijs et al., 2008), suggesting neuron-type specific functions (Table 25). These proteins are not likely to simply act as buffers or sinks but are rather thought to be involved in signaling events, triggered by oxygen binding (Persson et al., 2009). Why so many genes are needed for the seemingly simple task of oxygen binding is mysterious but likely relates to the fact that C. elegans navigates through environments with highly variable ambient oxygen concentrations. For example, oxygen concentrations in the soil vary from 1%-21%, depending on depth from the surface as well as soil properties such as compaction, aeration, and drainage (Anderson and Ultsch, 1987).

8. Presynaptic machinery

The machinery required for synaptic transmission is composed of many highly conserved proteins. Some components of the core SV fusion machinery were discovered by genetic studies of C. elegans neuronal function (see Synaptic function).

A SNARE complex mediates synaptic vesicle fusion. The core SNARE complex at the synapse is a ternary complex that is composed of two plasma membrane proteins (tSNAREs), syntaxin and SNAP25, plus a vesicle-associated protein of the synaptobrevin/VAMP family (vSNARE) (Wang and Tang, 2006; Parpura and Mohideen, 2008). Based on the presence of conserved Q and R amino acids, SNARE proteins have also been classified as Q-SNAREs and R-SNAREs. Q-SNAREs usually function as plasma membrane tSNAREs, and R-SNAREs function as vesicle-associated vSNAREs. In addition, SNAREs require an “SM” (Sec1/Munc18) protein to function, which is UNC-18 at C. elegans synapses.

A number of SNARE proteins are not directly involved in neurotransmitter exocytosis, but do show selective expression in specific domains of the nervous system and have distinct neuron-specific functions, including neurotransmitter receptor trafficking, neuronal morphogenesis and neurite outgrowth (Wang and Tang, 2006).

In C. elegans, the vesicle-associated vSNARE synaptobrevin is encoded by snb-1, and the plasma membrane-associated tSNAREs syntaxin and SNAP-25 are encoded by unc-64 and ric-4/snap-25, respectively. The C. elegans genome contains eight additional, mostly uncharacterized VAMP/synaptobrevin family members, including orthologs of Sec22 and Ykt6, and nine additional syntaxin family members (Table 26). Besides the canonical SNAP-25 ortholog, ric-4, there are two more SNAP-25-related genes, aex-4 and snap-29, and seven additional Q-SNAREs (Table 26).

A key feature of SNARE-mediated fusion at synapses is that it is calcium-dependent, and the calcium sensor synaptotagmin is required for synaptic vesicle fusion (Parpura and Mohideen, 2008). The calcium-sensing synaptotagmins have also expanded, with six additional snt genes besides the first characterized worm synaptotagmin snt-1 (Table 26). With >10 synaptobrevins, syntaxins and synaptotagmins each, mammals contain roughly the same number of these types of genes.

C. elegans contains single copies of many integral components of synaptic vesicles and regulatory factors involved in the synaptic vesicle cycle, as shown in Table 26. The reader is referred to Synaptic function for more detail.

Dense core vesicles are thought to secrete neuropeptides and represent a class of vesicle distinct from small vesicles that carry fast acting, “classic” neurotransmitters. The C. elegans genome encodes single orthologs of several proteins involved in the dense core vesicle cycle, including unc-31/CAPS, and the IA2-related tyrosine phosphatase ida-1 (Li and Kim, 2010).

Genes that are involved in the synaptic vesicle cycle are expressed in most if not all neuron types. They are often not restricted to the nervous system and are expressed in other tissue types as well.

Apart from calcium, synaptic vesicle release is regulated by neuronal GPCR signaling (Perez-Mansilla and Nurrish, 2009). One critical node in the integration of GPCR signals and synaptic vesicle release is the control of diacylglycerol (DAG) production, which in turn controls unc-13 activity. DAG levels are controlled by PLCβ (one clear ortholog in C. elegans, egl-8) and diacylglycerol kinases, of which there are five in the worm genome (dgk-1 through dgk-5). All of those for which expression has been analyzed (dgk-1, dgk-3, dgk-4) show neuronal expression, and dgk-1 has been implicated in synaptic function (Perez-Mansilla and Nurrish, 2009).

9. Neurotransmitter receptor localization: PDZ proteins

Various types of scaffolding proteins organize neurotransmitter receptor localization in postsynaptic densities. The most prominent type of such scaffolding proteins contains PDZ domains (Feng and Zhang, 2009). The C. elegans genome contains 70 proteins with easily recognizable PDZ domains (Table 27); the human genome contains several hundred. Several of the C. elegans PDZ domain proteins are orthologs of well-characterized PDZ proteins known to localize neurotransmitter receptors, while most of them have unknown functions. In C. elegans, the LIN-10 protein is known to be required for neurotransmitter receptor localization (Glodowski et al., 2005) and the multiple-PDZ protein encoded by mpz-1 colocalizes with the GPCR-type serotonin receptor SER-1 (Xiao et al., 2006). Other multidomain PDZ proteins may similarly be involved in neurotransmitter receptor clustering.

10. Gap junctions - the innexins

Gap junctions electrically couple cells. Within the C. elegans nervous system gap junctions form a giant neuronal network that can be broken down into subnetworks based on the number of gap junction contacts (Majewska and Yuste, 2001). Apart from their role in controlling circuit activity in the mature C. elegans nervous system (Starich et al., 2009; Kawano et al., 2011), gap junctions also have developmental roles during neuronal circuit formation (Chuang et al., 2007; Yeh et al., 2009).

Gap junctions in both vertebrates and invertebrates are formed by oligomerization of six homomeric or heteromeric subunits on each cell membrane. Hemichannels on either cell membrane connect in either a homotypic or heterotypic manner to form a gap junction. Recent work suggests that hemichannels alone may also have functions, possibly as some sort of “leak” channels that allow small molecules to leave the cell (Scemes et al., 2009).

In vertebrates, the gap junction subunits are called connexins (20 in humans) and pannexins (three in humans), while invertebrate gap junction subunits are called innexins (Starich et al., 2001; Scemes et al., 2009). All proteins have a similar topology of four transmembrane domains with cytoplasmic N- and C-terminal tails. Vertebrate connexins and invertebrate innexins share no notable primary sequence similarity, but vertebrate pannexins (3 genes) were actually identified based on their similarity to innexins (hence the term “pannexin”).

The C. elegans genome contains 25 innexin-encoding (inx) genes (Altun et al., 2009), about the same number as there are vertebrate connexin and pannexin genes. Based on a genome-wide reporter gene analysis, 20 out of the 25 inx genes are expressed in the nervous system (Table 28), many of them with striking cellular specificity (Altun et al., 2009). Expression of innexin genes—and hence gap junction circuitry—appears to be remarkably dynamic (Chuang et al., 2007; Altun et al., 2009).

Innexin function appears to be regulated by stomatins (Chen et al., 2007), which, as described above, also regulate other channels. As mentioned above, there are 10 stomatin-encoding genes in the worm genome (Table 3).

11. Motor proteins & their associated complexes

11.1. Kinesin, dynein and myosin motors

Besides several broad cellular roles, members of the kinesin, dynein, and myosin superfamily molecular motors have specific roles in neuronal function, plasticity, and morphogenesis (Hirokawa et al., 2010). Kinesins have been divided into specific subfamilies, several of which carry out essential functions in all dividing cells, while the most relevant for nervous system development are the subfamilies involved in synaptic vesicle transport and intraflagellar transport (see below and also The sensory cilia of Caenorhabditis elegans). 21 kinesin-like proteins are encoded in the C. elegans genome (Table 29), some characterized (e.g., unc-104, osm-3, klp-6), some completely uncharacterized. In addition, C. elegans contains three atypical kinesins, one with a well-documented function in axon pathfinding (vab-8).

Cytoplasmic dyneins consist of heavy chains, light intermediate chains, intermediate chains, and light chains (which fall into three further subfamilies). C. elegans contains representatives for each family, with some of the families being notably expanded (Table 30).

Lastly, the C. elegans genome contains 18 genes coding for proteins with myosin motor domains, including classic muscle myosin, but also several genes that are expressed in neurons (Table 31).

11.2. Motor complexes that build cilia of sensory neurons

Cilia are microtubule-containing organelles that emanate from the surfaces of most animal cells. There are two types of cilia: motile cilia used for locomotion or for the generation of fluid flow, and non-motile (primary) cilia, which are implicated in sensing the environment (see The sensory cilia of Caenorhabditis elegans). Unlike in many other organisms, including humans, sensory neurons are the only ciliated cell type in C. elegans. Sixty of the 302 neurons of the hermaphrodite are sensory neurons that possess non-motile, primary cilia, many with a striking morphological diversity (http://www.wormatlas.org).

To build ciliated structures and transport specialized functional components (such as receptors and ion channels) into them, a sophisticated anterograde and retrograde transport machinery traffics substrates from the so-called transition zone to the distal ends of the ciliated dendrites (The sensory cilia of Caenorhabditis elegans). Specific types of kinesins provide the anterograde motor activity, and specific dyneins the retrograde motor activity (Table 32). Besides the motor proteins, the intraflagellar transport complexes contain a substantial number of additional proteins that fall into three separate modules, the IFT-A, IFT-B, and BBSome modules (amounting together to a total of 27 proteins). A “parts list” of these complexes was recently compiled (Inglis et al., 2009) and is summarized in Table 32. Differential regulation of individual components of this core set of ciliary genes is thought to be at least in part responsible for building distinct types of cilia (Mukhopadhyay et al., 2007; Silverman and Leroux, 2009). More genes involved in building specific types of ciliated endings likely remain to be identified.

12. Neuronal recognition and adhesion molecules

A hallmark of nervous system architecture is the specificity of cellular contacts, either synaptic or adhesive. Proteins involved in synapse formation and maintenance are poorly characterized to date in any system. By contrast, a plethora of molecules are known to be involved in adhesive interactions in a mature nervous system and in cell recognition during development. The C. elegans genome contains a complex assembly of many distinct types of cell adhesion and extracellular matrix proteins. Among all these types of transmembrane or extracellular proteins, the classes of proteins with the most extensively documented function in the development and function of nervous systems of various species are the immunoglobulin superfamily members, the Leucine-Rich Repeat (LRR) proteins, cadherin family members, and neurexins and their various ligands. C. elegans has a number of representatives of each family.

12.1. Immunoglobulin superfamily

Members of the immunoglobulin superfamily are involved in axon pathfinding (e.g., unc-40/DCC, sax-3/Robo), synapse formation (syg-1, syg-2), neuronal axon and soma adhesion (sax-7), axonal maintenance (zig genes) and neurotransmitter receptor clustering (oig-4). Essentially all immunoglobulin superfamily members examined so far are expressed in a subset of neurons (Aurelio et al., 2002; Schwarz et al., 2009). The immunoglobulin superfamily of C. elegans has been described previously (Vogel et al., 2003; Hobert et al., 2004). Excluding intracellular Ig domain proteins (such as those encoded by unc-22, unc-89, dim-1, or unc-73), there are a total of 64 proteins that contain one to many copies of easily recognizable subtypes of Ig domains (Table 33). 18 of these proteins contain no other obvious protein domains, are not necessarily closely related to one another and are small secreted or transmembrane proteins of the zig and oig type. Many of the remaining proteins contain additional fibronectin-III domains, which are related to Ig domains, and some contain distinct complements of domains, some of which are implicated in cell adhesion, others in signaling (Table 33). In contrast to many other gene families discussed here, the Ig domain family has not expanded in worms, but has expanded in humans which have many hundreds of immunoglobulin superfamily members.

Notably absent from the C. elegans genome is a homolog of the immunoglobulin superfamily member DSCAM, a neuronal recognition protein with remarkable isoform diversity in flies (Hattori et al., 2008). There are also no obvious orthologs of mouse SynCAM proteins, although proteins with similar domain architecture exist (igcm-3, igcm-4). There is a worm homolog of vertebrate Sidekick (rig-4), implicated in synaptic targeting in vertebrates (Schwarz et al., 2009). Receptor tyrosine phosphatases (all containing extracellular Ig domains) have been implicated in various aspects of neuronal development (Paul and Lombroso, 2003). There are three of these genes in the C. elegans genome (Table 33). One of them, ptp-3, is the worm ortholog of LAR, which has been implicated in synapse maturation in several different species including worms (Stryker and Johnson, 2007).

12.2. Leucine-Rich Repeat (LRR) proteins

Members of the LRR family include the axon guidance cue Slit/slt-1 and several vertebrate proteins involved in synapse formation and function (de Wit et al., 2011). Extracellular LRR (eLRR) proteins (i.e., either secreted or transmembrane) in worms have been analyzed in silico (Dolan et al., 2007). 29 eLRR protein-encoding genes can be found in the worm genome (Table 33). Most are composed exclusively of LRRs; some also contain Ig domains. Some of the proteins are secreted, but most C. elegans eLRR proteins have transmembrane or GPI anchors. A number of them have highly conserved vertebrate orthologs, such as the guidance cue slt-1, the Toll-like receptor protein tol-1 or the neuropeptide receptor fshr-1. Even though there are no obvious orthologs of the vertebrate LRR synaptic adhesion molecules (LRRTMs, SALMs and NGLs), there are C. elegans proteins with similar domain architectures (multiple LRR domains and a transmembrane domain, or multiple LRR and Ig domains) (Table 33).

Many of the eLRR family members have been analyzed for expression in the nervous system, revealing neuronal expression for many of them (Liu and Shen, 2011) (Table 34). Notably, in contrast to many other gene families discussed here, the LRR family has, like the Ig domain family, not expanded in worms. There are almost five times as many eLRR proteins in mice and humans (Dolan et al., 2007).

12.3. Cadherins

The C. elegans genome contains 13 genes that code for proteins with cadherin domains, one more than previously noted (see The cadherin superfamily) (Table 34). This is significantly fewer than the more than 100 cadherin and cadherin-related genes in mammals. Representatives of several ancient cadherin subgroups can be found in C. elegans, including Flamingo, FAT and Dachsous-type cadherins as well as more classic cadherins, yet there are also a number of nematode-specific cadherins, which are mostly uncharacterized (The cadherin superfamily) (Table 34).

There are no protocadherins encoded in the fly or worm genome (The cadherin superfamily). DSCAM in Drosophila and the protocadherins in mammals are the two classes of diversely spliced cell-cell recognition molecules in metazoan nervous systems (Zipursky and Sanes, 2010). The absence of both types of isoform-rich molecules in the worm genome may be a testament to the reduced morphological complexity of its nervous system.

12.4. Neurexin and its ligands

Vertebrate neurexins are synaptic proteins that interact with a set of distinct partners to help synapses mature and function appropriately. C. elegans contains a single neurexin gene, nrx-1, that is broadly expressed in the nervous system but not yet functionally characterized (Table 35). There is presently no evidence in transcriptome datasets that the nrx-1 locus produces anything close to the tremendous number of alternatively spliced isoforms that are characteristic of vertebrate neurexins (Missler and Sudhof, 1998). Neurexin-like genes of the CASPR family can also be found in the worm genome (Haklai-Topper et al., 2011) (Table 35).

The C. elegans genome contains orthologs of several neurexin binding proteins, including neuroligin (nlg-1), a neuroligin-related protein (glit-1) and two latrophilin genes (lat-1 and lat-2). There are no obvious orthologs of LRRTM1/2, synaptic adhesion molecules that interact with Neurexin, but proteins with similar domain architectures can be found, as mentioned above in the LRR section (Table 34). Clear orthologs of two other neurexin binding partners, Cbln1 and the Neurexophilins, likewise cannot be found in the worm genome (Table 35).

13. Conclusions

This compendium covers ~2,800 genes with predicted neuronal functions. Yet it likely only scratches the surface of neuron-type specific gene batteries, since many additional genes are already known, and many more will be found, to show neuron-type specific expression patterns in the mature nervous system. In fact, any given neuron type is expected to express several thousand genes, and the comparison of even closely related individual neuron types revealed more than 1,200 differentially expressed genes (Etchberger et al., 2007).

Even though the gene family analyses provided here offer just a glimpse of neuronal molecular diversity, one pervasive theme emerges: the expansion of many gene families with specific neuronal functions. This expansion is not strictly C. elegans-specific but can also be observed in C. briggsae and C. remanei, suggesting that the expansions occurred more than 100 million years ago. The scale of expansion becomes even more impressive if one considers that the progression from invertebrates to vertebrates was accompanied by two genome duplications. Thus, in theory, any worm gene should have four vertebrate orthologs, as is indeed often observed. Yet the picture is dramatically different in many of the gene families discussed here. Even cases with similar overall gene numbers (e.g., 21 Gα genes in worms compared to around 21 Gα genes in vertebrates) already argue for gene family expansions in worms (i.e., if there are ~20 Gα genes in vertebrates, one would have expected only about 5 genes in worms). Many more dramatic expansions are apparent, including the expansion of the two pore TWK channel family, the Cys-loop ligand gated ion channel family, chloride channels, DEG/ENaC channels, specific subfamilies of the SLC-type transporters (both vesicular transport and synaptic reuptake), neuropeptide-processing enzymes and others. Notably though, cell adhesion families (IgSF, LRRs, cadherins) have not expanded in worms. Many more genes of these types can be found in mammals, a likely testament to the increased morphological diversity of vertebrate neuron types.
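The expectation behind the Gα example can be written out as a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each of the two genome duplications doubled every gene with no subsequent gene loss, which is a deliberate simplification.

```python
# Illustrative arithmetic only. Assumption: two whole-genome duplications in
# the vertebrate lineage, each doubling every gene, with no later gene loss.
GENOME_DUPLICATIONS = 2


def expected_worm_genes(vertebrate_genes: int,
                        duplications: int = GENOME_DUPLICATIONS) -> float:
    """Worm gene count expected if the vertebrate family grew by duplication alone."""
    return vertebrate_genes / (2 ** duplications)


def fold_expansion(worm_genes: int, vertebrate_genes: int) -> float:
    """Observed worm gene count relative to the duplication-based expectation."""
    return worm_genes / expected_worm_genes(vertebrate_genes)


if __name__ == "__main__":
    # G-alpha example from the text: 21 worm genes vs. roughly 21 vertebrate genes.
    print(expected_worm_genes(21))  # 5.25, i.e. about 5 genes expected in worms
    print(fold_expansion(21, 21))   # 4.0
```

Under these assumptions, equal gene numbers in worms and vertebrates already correspond to a roughly fourfold expansion on the worm side.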

Gene family expansion in worms is also apparent if one stays closer to home and only compares worms and flies, both members of the Ecdysozoa clade. Even though the Drosophila nervous system contains more than 300 times as many neurons as C. elegans (~100,000 vs. 302), most gene family expansions discussed above are observed in worms, but not flies. This observation supports the notion that the larger worm gene numbers in specific gene families are not the result of gene loss in other genomes but more likely reflect gene family expansion in a specific invertebrate phylum. Within the nematodes, neuronal gene family expansions are not restricted to C. elegans, but can be observed in C. briggsae and C. remanei, often (but not always) with clear one-to-one ortholog matches. Whether this holds for distant nematode species can only be assessed once other nematode genome sequences are better annotated.

There are three common functional themes in the gene family expansions. First, the expansion in the vesicular and reuptake transporters together with the expansion of ligand-gated ion channels suggests the existence of as-yet undiscovered neurotransmitter/neuromodulator systems. Large GPCR subfamilies, like the srw gene family (with similarities to neuropeptide receptors), as well as the pervasive non-sensory neuron expression of many olfactory-type GPCRs appear to make the same point. The ability to tune membrane potentials with the hugely expanded two pore TWK potassium gene family is also consistent with a tremendous adaptability of C. elegans neurons to various types of signaling inputs. Second, the worm has clearly expanded its repertoire of neuronal mechanisms with which it monitors its environment. This expanded sensory repertoire includes sensory GPCR proteins, GCY proteins, globins and ion channels (most notably the DEG/ENaC/ASIC family). This is consistent with the observation that many sensory modalities are tuned over very narrow ranges. These expanded sensory functions are also likely to reflect the complex and highly variable sensory environment in which nematodes find themselves. Intriguingly, even the expansion of the ligand-gated ion channels may possibly be related to sensory functions, as exemplified by the choline-sensing DEG-3 and DES-2 LGICs. Last, the expanded neuronal “toolbox” suggests that in its molecular composition each C. elegans neuron may be a much more complex information processing device than a fly or vertebrate neuron. Understanding the functions of each individual component of this toolbox and understanding how the expression of all these genes is regulated is a daunting challenge.

14. Tables 2-35

Table 2: Potassium channels (72 genes)

Topology Family Gene (alt. name) Expression1
6-transmembrane Voltage-gated: Shaker/Kv1 subfamily (1 gene) shk-1 number of interneurons and sensory neurons
  Voltage-gated: Shab/Kv2 subfamily (6 genes) exp-2 muscle, sensory neurons
    kvs-1 motorneurons, sensory neurons
    kvs-2 ?
    kvs-3 ?
    kvs-4 ?
    kvs-5 ?
  Voltage-gated: Shaw/Kv3 subfamily (3 genes) shw-1 ?
    egl-36 (shw-2) subset of neurons, muscle
    shw-3 (kht-1) subset of neurons
  Voltage-gated: Shal/Kv4 subfamily (1 gene) shl-1 neurons
  KQT family (3 genes) kqt-1 subset of neurons
    kqt-2 intestine
    kqt-3 sensory neurons
  Eag-like/Kv10-12 family (2 genes) egl-2 sensory neurons, muscle
    unc-103 many neurons, muscle
  Calcium-activated Slo family (2 genes) slo-1 (nsy-3) subset of neurons, muscle
    slo-2 subset of neurons
  Calcium-activated SK family (4 genes) kcnl-1 ?
    kcnl-2 subset of neurons
    kcnl-3 ?
    kcnl-4 ?
4-transmembrane TWK Channel family (47 genes) egl-23 (twk-41) ?
    sup-9 (twk-38) subset of neurons, muscle
    unc-58 motorneurons, interneurons
    unc-110 (twk-18) muscle only
    twk-1 hypoderm
    twk-2 subset of neurons
    twk-3 subset of neurons
    twk-4 subset of neurons
    twk-5 ?
    twk-6 neurons + others
    twk-7 ?
    twk-8 (twk-19) muscle
    twk-9 ?
    twk-10 ?
    twk-11 ?
    twk-12 ?
    twk-13 (twk-15) ?
    twk-14 ?
    twk-16 subset of neurons
    twk-17 subset of neurons
    twk-20 neurons and muscle
    twk-21 ?
    twk-22 pharynx
    twk-23 neurons + others
    twk-24 ?
    twk-25 ?
    twk-26 ?
    twk-28 muscle
    twk-29 subset of neurons
    twk-30 subset of neurons
    twk-31 ?
    twk-32 subset of neurons
    twk-33 ?
    twk-34 ?
    twk-35 ?
    twk-36 excretory cell
    twk-37 ?
    twk-39 ?
    twk-40 ?
    twk-42 ?
    twk-43 ?
    twk-44 ?
    twk-45 ?
    twk-46 neurons
    twk-47 ?
    twk-48 ?
    twk-49 ?
2-transmembrane Kir family (3 genes) irk-1 small number of neurons
    irk-2 head neurons
    irk-3 head neurons

1From www.wormbase.org
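The subfamily sizes given in the family headings of Table 2 can be checked against the stated total of 72 genes. The snippet below is a minimal sketch; the dictionary and variable names are hypothetical, and the counts are simply copied from the table headings.

```python
# Subfamily sizes copied from the family headings of Table 2.
potassium_channel_families = {
    "Shaker/Kv1": 1,
    "Shab/Kv2": 6,
    "Shaw/Kv3": 3,
    "Shal/Kv4": 1,
    "KQT": 3,
    "Eag-like/Kv10-12": 2,
    "Slo": 2,
    "SK": 4,
    "TWK": 47,
    "Kir": 3,
}

total = sum(potassium_channel_families.values())
twk_fraction = potassium_channel_families["TWK"] / total

print(total)                   # 72
print(round(twk_fraction, 2))  # 0.65
```

By these counts the TWK family alone accounts for roughly two thirds of all potassium channel genes listed.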

Table 3: Candidate auxiliary subunits for various types of ion channels (93 genes)

Auxiliary subunit for Gene Domains/homology Experimentally confirmed Expression1
Voltage-gated K+ channels mps-1 KCNE/MinK ortholog yes subset of neurons
  mps-2   yes subset of neurons
  mps-3   yes subset of neurons
  mps-4   yes subset of neurons
  K01A2.9 mps-2-related no ?
  K01A2.12   no ?
  K01A2.3   no ?
  K01A2.4   no ?
  C25A8.2 mps-3-related no ?
  R02D5.7   no ?
  F30A10.12   no ?
  F53G12.8   no ?
  sssh-1 Drosophila sleepless ortholog no ?
  ncs-6 KChIP1,2,3,4 related no ?
  ncs-7   no ?
  ncs-8   no ?
  ncs-9   no ?
  dpf-1 Dipeptidyl-peptidase IV-like peptidases no ?
  dpf-2   no muscle, seam cells
  dpf-3   no ?
  dpf-4   no ?
  dpf-5   no Intestine, rectal gland cells
  dpf-6   no Pharynx, intestine
  dpf-7   no ?
  mec-14 oxidoreductase no ?
  ctf-1 ABCC ortholog, unclear which subfamily no excretory cell
  mrp-1 ABCC1/2/6-ortholog (SUR = ABCC8/9) no some neurons, pharynx, intestine, hypodermis
  mrp-2   no some neurons, pharynx, intestine, excretory cell
  mrp-3   no ?
  mrp-4   no intestine
  mrp-7   no neurons, muscle, intestine
  mrp-8   no ?
  mrp-5 ABCC5/1/12 ortholog no neurons, pharynx, intestine, muscle, hypodermis
  mrp-6 ABCC4 ortholog no neurons, intestine
  bkip-1 no ortholog yes neurons, muscle
TWK-type potassium channels sup-10 muscle
  unc-93 human UNC-93A no neurons, muscle
  Y39B6A.27 human UNC93-like MFSD11 (TF315284) no ?
  Y39B6A.29     ?
  Y37A1A.2     ?
  ZK6.6     ?
  ZK6.8     ?
  B0554.5     excretory cell
  B0554.7     ?
  C08D8.1     ?
  C27C12.4     neurons
  F31D5.2     ?
  F31D5.1     ?
  M153.2     ?
  Y11D7A.3     ?
  Y39D8A.1     ?
  Y52E8A.4     ?
  F36G9.3     ?
Voltage-gated calcium channels unc-79 none yes neurons
  unc-80   yes neurons
nAChR LGIC lev-9 multiple Sushi/CCP domains yes (worms) neurons
  T07H6.4   no ?
  lev-10 CUB domains, LDL domain, TM domain2 yes NMJ
  mig-13   no subset of neurons
  neto-1   no ?
  K05C4.11   no ?
  molo-1 TPM domain   yes muscle
  R02D5.3   no ?  
  F15B9.10   no ?  
  F01D5.6   no ?  
  Y54E2A.10   no ?  
  Y12A6A.1   no ?  
  C09B8.3   no ?  
  lurp-1 Lynx/SLURP orthologs (LU domain) yes (vertebrates) neurons  
  lurp-2     ?  
  lurp-3     ?  
  lurp-4     ?  
  odr-2 Ly6-related domain no subset of neurons  
  hot-1   no ?  
  hot-2   no ?  
  hot-3   no ?  
  hot-4   no subset of neurons  
  hot-5   no ?  
  hot-6   no ?  
  hot-7   no ?  
  hot-8   no ?  
  hot-9   no ?  
AMPA-type Glu receptors (TARPs) sol-1 CUB domains yes neurons  
  stg-1 stargazin-orthologs yes subset of neurons  
  stg-2   yes subset of neurons  
  lev-10 CUB domains, LDL domain, TM domain2 no subset of neurons  
  mig-13   no subset of neurons  
  neto-1   no ?  
  K05C4.11   no ?  
  cni-1 Cornichon-ortholog   no ?
DEG/ENaC channels (and perhaps TRP) mec-2 stomatin yes neurons  
  unc-1   no neurons  
  unc-24   no neurons  
  stl-1   no ?  
  sto-1   no pharynx, neurons  
  sto-2   no ?  
  sto-3   no ?  
  sto-4   no neurons  
  sto-5   no ?  
  sto-6   no ?  

1From www.wormbase.org

2One family member, lev-10, encodes a confirmed auxiliary subunit for C. elegans nAChRs. Another one, neto-1, encodes the worm ortholog of vertebrate Neto proteins, which are auxiliary subunits for glutamate receptors. Whether other members of this family are also auxiliary subunits, and for which type of channel, is not yet known.


Table 4: Voltage-gated calcium channels (9 genes)

Subunit Gene Type Expression1
α1 egl-19 HVA: L-type neurons, muscle, hypodermis
unc-2 HVA: non-L-type neurons
cca-1 LVA: T-type neurons, pharynx
nca-1 α1U type neurons
nca-2 α1U type neurons
α2δ unc-36   neurons, muscle
tag-180   neurons
β ccb-1   pharynx, muscle
ccb-2   ?

1From www.wormbase.org

Table 5: SLC transporters with confirmed or putative neuronal functions (82 genes)

SLC class Gene Subfamily Likely substrate Expression1
SLC17: Vesicular glutamate transporter family (14 genes) eat-4 SLC17A6-8 glutamate neurons
vglu-2 glutamate ?
vglu-3 glutamate neurons
ZK54.1   ?
C38C10.2 SLC17A1-5 Asp/Glu head/tail neurons
vnut-1 SLC17A9 nucleotides ?
C02C2.4 no specific subfamily, worm-specific expansion   neurons, intestine
T28F3.4 ?
F21F8.11 ?
F12B6.2 ?
ZK682.2 ?
F25G6.7 ?
T09B9.2 ?
F45E4.11 ?
SLC18: Vesicular amine transporter family (2 genes) cat-1   dopamine, histamine, serotonin, tyramine, octopamine all DA, 5HT, Tyr, Oct neurons
unc-17   acetylcholine all cholinergic neurons
SLC32: Vesicular inhibitory amino acid transporter family (GABA & Glycine) (1 gene) unc-47   GABA all GABA neurons
SLC1: High-affinity glutamate and neutral amino acid transporter family (6 genes) (REUPTAKE) glt-1   glutamate muscle
glt-3 excretory canal
glt-4 some neurons, pharynx
glt-5 pharynx
glt-6 pharynx, excretory canal
glt-7 excretory canal
SLC6: Na+/Cl- dependent neurotransmitter transporter family (17 genes) (REUPTAKE) dat-1 SLC6A2,3,4 dopamine DA neurons
mod-5 serotonin subset of 5HT neurons
snf-1     ?
snf-2 cluster 1   ?
snf-3     neurons, excretory system
snf-4 cluster 1   ?
snf-5 cluster 1   neurons, intestine
snf-6   acetylcholine/choline muscle
snf-7 cluster 1   ?
snf-8 cluster 1   ?
snf-9 cluster 1   ?
snf-10     ?
snf-11 (gat-1) SLC6A1 GABA neurons, muscle
snf-12     vesicles in hypodermis
F56F4.3 low similarity   ?
C09E8.1   ?
Y43D4A.1   ?
SLC28: Na+ coupled nucleoside transporter family (2 genes) (REUPTAKE ?) F27E11.1/slc-28.1   nucleosides ?
F27E11.2/slc-28.2 ?
SLC29: Facilitative nucleoside transporters (includes low affinity, high capacity monoamine transporters) (7 genes) (REUPTAKE) ent-1   Monoamines, others pharynx, intestine
ent-2 pharynx, intestine
ent-3 ?
ent-4 ?
ent-5 ?
ent-6 ?
ent-7 ?
SLC8 & SLC24: Na+/Ca2+ exchanger & Na+/Ca2+-K+ exchanger (10 genes) ncx-1 SLC8 Na+/Ca2+ exchanger ?
ncx-2 ?
ncx-3 ?
ncx-4 SLC24 Na+/Ca2+-K+ exchanger ?
ncx-5 ?
ncx-6 ?
ncx-7 ?
ncx-8 ?
ncx-9 ?
ncx-10 ?
SLC30: cation diffusion facilitator (CDF) family (12 genes) cdf-2 SLC30A2,3,4,8   intestine
ttm-1 ?
cdf-1 SLC30A1,10   muscle and intestine
Y105E8A.3 SLC30A5,7   ?
toc-1 SLC30A6   subset of neurons
Y71H2AM.9 SLC30A9   ?
F41C6.7 diverse   ?
F56C9.3 ?
K07G5.5 ?
PDB1.1 ?
ZK185.5 ?
R02F11.3 ?
SLC12: cation-chloride cotransporter family (7 genes) kcc-1 SLC12A4-6 K+/Cl- transporter muscle, neurons, intestine
kcc-2 muscle, neurons
kcc-3 glial cells
nkcc-1 SLC12A1,2 Na+/K+/2Cl transporter ?
F10E7.9   Na+/K+/2Cl transporter? neurons
B0303.11   Na+/K+/2Cl transporter? excretory system
T04B8.5 SLC12A9   neurons
SLC4: Cl–HCO3 exchangers (4 genes) abts-1 SLC4A7-10   neurons, hypodermis, muscle
abts-2 SLC4A11   subset of neurons
abts-3   neurons, hypodermis
abts-4 SLC4A1-3   subset of neurons

1From www.wormbase.org

Table 6: Calcium binding proteins – the “EF hand-only” proteins (65 genes)

Family Gene (alt. name) Homolog (and/or domains) Expression1
Calmodulin family (9 genes) cmd-1 calmodulin (best hit) ?
  cal-1 calmodulin-like ?
  cal-2   ?
  cal-3   ?
  cal-4   muscle
  cal-5 ?  
  cal-6   ?
  cal-7   ?
  cal-8   ?
NCS family (7 genes) ncs-1 human NCS-1 subset of neurons
  ncs-2 human recoverin ?
  ncs-3 human NCS-1 ?
  ncs-4 KChIP1,2,3,4/DREAM ?
  ncs-5   ?
  ncs-6   ?
  ncs-7   ?
Others (49 genes) cnb-1 calcineurin (regulatory subunit of PP2B) neurons, muscle
  rsa-1 PP2A regulatory subunit ?
  C06G1.5   muscle
  cex-1 Calexcitin neurons
  cex-2   neurons, hypodermis, pharynx
  efdh-1 EFHD1/2 ortholog neurons, muscle, pharynx
  F59D6.7 distant EFHD/AIF ?
  R08D7.5 Centrin (caltractin) 1/2/3 ortholog only neurons
  C56C10.9 human SDF4 ?
  F55A11.1 human MCFD2 ?
  T04F3.4 human MCFD2 (distant) ?
  C29E4.14 human MCFD2 (distant) ?
  ZK856.8 human CHP1/2 ortholog ?
  pbo-1 ?  
  F59D6.7 ?  
  micu-1 calcitonin-related ?
  calm-1 CIB ortholog (Calcium and integrin binding) ?
  calu-1 human CALU (Reticulocalbin) muscle, intestine, pharynx
  calu-2   ?
  nucb-1 nucleobindin neurons, muscle
  T04F8.6 ninein ortholog ?
  reps-1 RALBP1 ?
  mlc-1 myosin light chain muscle
  mlc-2   muscle
  mlc-3   muscle
  mlc-4   ?
  mlc-5   neurons, hypodermis, intestine
  mlc-6   ?
  mlc-7   ?
  tnc-1 (pat-10) Troponin muscle
  tnc-2   pharynx
  F43C9.2 distally related to troponin ?
  cbn-1 none specifically ?
  B0563.7 none specifically ?
  C50C3.5 none specifically ?
  C56A3.6 none specifically ?
  E02A10.3 none specifically neurons, intestine
  F23F1.2 none specifically ?
  H10E21.4 none specifically ?
  K03A1.4 none specifically ?
  M04F3.4 none specifically ?
  R09H10.6 none specifically subset of neurons
  T03F1.11 none specifically ?
  T04F3.4 none specifically ?
  Y73C8B.5 none specifically reproductive system
  F16F9.3 none specifically ?
  T09B4.4 none specifically ?
  T02G5.2 none specifically ?

EF hands show similarity with EH domains. Genes were not included where InterPro predicted the same domain to be similar to EF hands and to EH domains.

1From www.wormbase.org

Table 7: TRP channels (23 genes)

Family Gene (alt. name) Expression1
TRPV ocr-1 sensory neurons
ocr-2 sensory neurons
ocr-3 gland cells
ocr-4 sensory neurons
osm-9 sensory neurons
TRPP lov-1 sensory neurons
pkd-2 sensory neurons
TRPN trp-4 sensory neurons
TRPML cup-5 broad
TRPM ced-11 hypodermis
gon-2 pharynx, excretory cell, intestine
gtl-1 neurons, intestine
gtl-2 excretory cell
TRPC trp-1 neurons, muscle
trp-2 neurons
spe-41 (trp-3) sperm
TRPA trpa-1 neurons
trpa-2 neurons
TRPM-related2 (TF315286) trpl-1 ?
trpl-2 ?
trpl-3 ?
trpl-4 ?
trpl-5 ?

1From www.wormbase.org

2A subfamily of genes related to one another (PTHR13800 (“Transient Receptor Potential Cation Channel, Subfamily M”)), as identified with InterProScan.

Table 8: Cyclic nucleotide gated channels (6 genes)

Gene Homology Expression1
tax-2 β subunit subset of sensory neurons
tax-4 α subunit subset of sensory neurons
cng-1 α/β subset of sensory neurons
cng-2 α/β ?
cng-3 α/β subset of sensory neurons
che-6 α/β subset of sensory neurons

1From www.wormbase.org

Table 9: nAChR-type ligand-gated ion channels of the Cys-loop LGIC superfamily (61 genes)

Gene Gated by (as confirmed in vitro) Expression1
acr-2 ACh subset of neurons
acr-3 ACh subset of neurons (possible operon with acr-2)
acr-5   subset of neurons
acr-6   ?
acr-7   subset of neurons
acr-8   subset of neurons
acr-9   ventral cord neurons
acr-10   ?
acr-11   subset of neurons
acr-12 ACh exclusively in neurons, including ventral cord neurons
acr-13/lev-8 ACh subset of neurons
acr-14   subset of neurons
acr-15   subset of neurons
acr-16   muscle, some neurons
acr-17   ?
acr-18   ?
acr-19   ?
acr-20   ?
acr-21   ?
acr-23   ?
acr-24   ?
acr-25   ?
cup-4   broad
deg-3   subset of neurons
des-2/acr-4   subset of neurons
eat-2   pharyngeal muscles
lev-1 ACh muscle, some neurons in ventral cord
unc-29 ACh muscle, some neurons
unc-38 ACh muscle, many neurons
unc-63 ACh muscle, many neurons
lgc-1   ?
lgc-2/pbo-5 protons subset of neurons, muscle
lgc-3/pbo-6 protons muscle
lgc-4   ?
lgc-5   ?
lgc-6   ?
lgc-7   ?
lgc-8   ?
lgc-9   ?
lgc-10   ?
lgc-11   ?
lgc-12   ?
lgc-13   ?
lgc-14   ?
lgc-15   ?
lgc-16   ?
lgc-17   ?
lgc-18   ?
lgc-19   ?
lgc-20   ?
lgc-21   ?
lgc-22   ?
lgc-23   ?
lgc-24   ?
lgc-25   ?
lgc-26   ?
lgc-27   ?
lgc-28   ?
lgc-29   ?
lgc-30   ?
lgc-31   ?

1From www.wormbase.org

Table 10: Other ligand-gated ion channels of the Cys-loop LGIC superfamily (41 genes)

Subgroup Gene Experimentally confirmed ligand Expression1
GABA subgroup (7 genes) gab-1 GABA ?
unc-49 GABA muscle only
exp-1 GABA subset of neurons, muscle
lgc-35   motorneurons, muscle
lgc-36   ?
lgc-37   ?
lgc-38   ?
Aminergic subgroup (8 genes) mod-1 serotonin subset of neurons
lgc-50   ?
lgc-51   ?
lgc-52   ?
lgc-53 dopamine ?
lgc-54   ?
lgc-55 tyramine subset of neurons
ggr-3   subset of neurons
GluCl subgroup (6 genes) avr-14 glutamate subset of neurons
avr-15 glutamate subset of neurons
glc-1 glutamate ?
glc-2 glutamate pharynx only
glc-3   head neurons
glc-4   head neurons
ACC subgroup (8 genes) acc-1 acetylcholine ?
acc-2 acetylcholine ?
acc-3 acetylcholine ?
acc-4 acetylcholine neurons
lgc-46   motor neurons, muscle
lgc-47   ?
lgc-48   ?
lgc-49   ?
diverse (12 genes) lgc-32   ?
lgc-33   ?
lgc-34   muscle
ggr-1   subset of neurons
ggr-2   subset of neurons
lgc-39   ?
lgc-40   ?
lgc-41   ?
lgc-42   ?
lgc-43   ?
lgc-44   ?
lgc-45   ?

1From www.wormbase.org

Table 11: Ionotropic glutamate receptors (15 genes)

Type Gene Expression1
NMDA-type nmr-1 subset of neurons
nmr-2 subset of neurons
AMPA-type glr-1 subset of neurons
glr-2 subset of neurons
glr-3 subset of neurons
glr-4 subset of neurons
glr-5 subset of neurons
glr-6 subset of neurons
glr-7 subset of neurons
glr-8 subset of neurons
Diverse2 ZK867.2 ?
C08B6.5 ?
W02A2.5 subset of neurons
F59E12.8 ?
T25E4.2 ?

1From www.wormbase.org

2Clear homology to GLR receptors, containing predicted extracellular solute-binding protein domain (like GLRs), but lacking some other core sequence features of GLRs (Brockie et al., 2001).

Table 12: DEG/ENaC channels (32 genes)

Gene Cosmid name Notes on homologies or domains Expression1
flr-1 F02D10.5 Subgroup 1 intestine
acd-1 C24G7.2 amphid sheath
acd-2 C24G7.4 ?
acd-3 C27C12.5 ?
acd-4 F28A12.1 ?
acd-5 T28F2.7 ?
delm-1 F23B2.3 ?
delm-2 C24G7.1 ?
asic-1 ZK770.1 Subgroup 2 neurons
asic-2 T28F4.2 ?
mec-4 T01C8.7 subset of neurons
del-4 T28B8.5 subset of neurons
unc-105 C41C4.5 muscle
deg-1 C47C12.6 subset of neurons
del-1 E02H4.1 subset of neurons
mec-10 F16F9.5 subset of neurons
egas-1 Y69H2.11 Subgroup 3 (EGF + ASC domain) ?
egas-2 Y69H2.12 ?
egas-3 Y69H2.2 ?
egas-4 F55G1.12/13 (fused) ?
degt-1 F25D1.4   PVD
del-2 F58G6.6   subset of neurons
del-3 F26A3.6   neurons
del-5 F59F3.4   ?
del-6 T21C9.3   ?
del-7 C46A5.2   ?
del-8 C11E4.3   ?
unc-8 R13A1.4   neurons
del-9 C18B2.6   ?
del-10 T28D9.7   ?
  Y57G11C.44 short fragment, pseudogene? ?
  F58G6.8 short fragment, pseudogene? ?

1From www.wormbase.org

Table 13: Chloride channels (35 genes)

Type Gene (alt. name) Expression1
CLC-type (6 genes) clh-1 (clc-1) hypodermis, neurons
clh-2 (clc-2) head, tail neurons, vulval muscles
clh-3 (clc-3) Excretory cell, vulva, neurons, enteric muscles, epithelial cells
clh-4 (clc-4) excretory cell
clh-5 (clc-5) pharynx, intestine, hypodermis, unidentified cells in head and tail
clh-6 (clc-6) pharynx, intestine, excretory cell, neurons
anoctamin-related (2 genes) anoh-1 ?
anoh-2 neurons
tweety-related (1 gene) ttyh-1 broadly in neurons
bestrophin-related (26 genes) best-1 ?
best-2 ?
best-3 hypodermis, excretory cell, muscle
best-4 ?
best-5 ?
best-6 ?
best-7 ?
best-8 ?
best-9 ?
best-10 ?
best-11 ?
best-12 ?
best-13 neurons, intestine
best-14 ?
best-15 ?
best-16 ?
best-17 ?
best-18 ?
best-19 ?
best-20 ?
best-21 ?
best-22 ?
best-23 ?
best-24 neurons, intestine, hypodermis
best-25 ?
best-26 ?

1From www.wormbase.org


Table 14: Neurotransmitter synthesis and degradation (36 genes)

Process Gene (alt. name) Enzymatic activity Expression1
ACh synthesis cha-1 choline acetyltransferase ACh neurons
acly-1 ATP citrate lyase ?
acly-2 ?
pmt-1 phosphoethanolamine N-methyltransferase hypodermis
pmt-2 ?
GABA synthesis unc-25 glutamic acid decarboxylase (GAD) GABA neurons
Biogenic amine synthesis2 tph-1 tryptophan hydroxylase (requires BH4 cofactor) 5HT neurons
tbh-1 tyramine beta hydroxylase OA neurons (RIC)
cat-2 tyrosine hydroxylase (requires BH4 cofactor) dopamine neurons
cat-4 GTP cyclohydrolase (for BH4 synthesis2) 5HT + dopamine neurons
gfrp-1 GTP cyclohydrolase feedback regulator (for BH4 synthesis2) ?
ptps-1 6-pyruvoyl-tetrahydropterin synthase (for BH4 synthesis2) ?
pcbd-1 pterin-4-alpha-carbinolamine dehydratase (BH4 recycling2) ?
qdpr-1 quinoid dihydropteridine reductase (BH4 recycling2) ?
tdc-1 aromatic amino acid decarboxylase (AAAD) Tyr + OA neurons (RIM, RIC)
bas-1 5HT + dopamine neurons
hdl-1 ?
hdl-2 (tag-19) ?
basl-1 inactive aromatic amino acid decarboxylase ?
anat-1 arylalkylamine N-acetyltransferase subset of neurons
homt-1 hydroxyindole-O-methyltransferase PVT + uterine cells
anmt-1 amine N-methyltransferase (PNMT, INMT, NNMT) muscle, intestine, pharynx
anmt-2 pharynx
anmt-3 muscle, pharynx
ACh degradation ace-1 ACh esterase some neurons, muscle
ace-2 subset of ACh neurons
ace-3 some neurons, muscle
ace-4 some neurons, muscle
Monoamine degradation amx-2 MAO-A/B ?
comt-1 Catechol-O-Methyltransferase (COMT) homologs ?
comt-2 ?
comt-3 ?
comt-4 ?
comt-5 ?
GABA degradation gta-1 GABA transaminase ?
adh-7 succinic semialdehyde dehydrogenase ?

ACh = acetylcholine, 5HT = serotonin, DA = dopamine, Tyr = tyramine, OA = octopamine

1From www.wormbase.org

2See Figure 6 and Figure 7. Note that the involvement in biogenic amine synthesis of some of these genes is not at all proven (e.g., anmt genes).

Table 15: Neuropeptide-encoding genes (122 genes)

Type Genes Expression1
Insulin-related peptides (40 ins genes) daf-28, ins-1 through ins-39
  • 15 expression patterns
  • 9/15 neuron-restricted
  • 6/15 neurons + non-neurons
FMRFamides (31 flp genes) flp-1 through flp-28, flp-32, flp-33, flp-44
  • 19 expression patterns
  • 15/19 neuron-restricted
  • 4/19 neurons + non-neurons
Neuropeptide-like proteins (51 genes) nlp-1 through nlp-48
  • 29 expression patterns
  • 2/29 neuron-restricted
  • 6/29 non-neuronal only
  • 21/29 neurons + non-neurons
pdf-1 subset of neurons
snet-1 subset of neurons
ntc-1 subset of neurons

1From www.wormbase.org

Table 16: Metabolism of neuropeptides (47 genes)

Process Gene Type of protein Expression1
Maturation (7 genes) egl-3 PC-type proprotein convertase many, but not all neurons
kpc-1 ?
bli-4 neurons, hypodermis
aex-5 muscle
egl-21 Carboxypeptidase E neurons
cpd-1 Carboxypeptidase D pharynx
cpd-2 ?
Modification (3 genes) pghm-1 Peptidylglycine alpha-amidating monooxygenase ?
pgal-1 ?
pamn-1 neurons
Degradation (37 genes) acn-1 ACE-like protein (catalytically inactive) hypodermis
nep-1 neprilysin pharynx, neuron
nep-2 muscle, glia, neurons
nep-3 ?
nep-4 ?
nep-5 ?
nep-6 ?
nep-7 ?
nep-8 ?
nep-9 ?
nep-10 ?
nep-11 ?
nep-12 ?
nep-13 ?
nep-14 ?
nep-15 ?
nep-16 ?
nep-17 intestine
nep-18 ?
nep-19 ?
nep-20 ?
nep-21 neurons
nep-22 ?
nep-23 ?
nep-24 ?
nep-25 ?
nep-26 ?
nep-27 ?
dpf-1 Dipeptidyl-peptidase IV ?
dpf-2 muscle, seam cells
dpf-3 ?
dpf-4 ?
dpf-5 intestine, rectal gland cells
dpf-6 pharynx, intestine
dpf-7 ?
dpt-1 Dipeptidyl-peptidase III ?
tpp-2 Tripeptidyl peptidase II neurons, intestine

1From www.wormbase.org

Table 17: Insulin/EGF receptor-like proteins (70 genes)

Gene Localization1 Expression2
daf-2 transmembrane or membrane associated very broad
hpa-1 subset of neurons, glia
irld-1 ?
irld-2 ?
irld-3 ?
irld-4 ?
irld-5 ?
irld-6 ?
irld-7 ?
irld-8 ?
irld-9 ?
irld-10 ?
irld-11 ?
irld-12 ?
irld-13 ?
irld-14 ?
irld-15 ?
irld-16 ?
irld-17 ?
irld-18 ?
irld-19 ?
irld-20 ?
hpa-2 secreted subset of neurons, glia
irld-21 ?
irld-22 ?
irld-23 ?
irld-24 ?
irld-25 ?
irld-26 ?
irld-27 ?
irld-28 ?
irld-29 ?
irld-30 ?
irld-31 ?
irld-32 ?
irld-33 ?
irld-34 ?
irld-35 ?
irld-36 ?
irld-37 ?
irld-38 ?
irld-39 ?
irld-40 ?
irld-41 ?
irld-42 ?
irld-43 ?
irld-44 ?
irld-45 ?
irld-46 ?
irld-47 ?
irld-48 ?
irld-49 ?
irld-50 ?
irld-51 ?
irld-52 ?
irld-53 ?
irld-54 ?
irld-55 ?
irld-56 ?
irld-57 ?
irld-58 ?
irld-59 ?
irld-60 ?
irld-61 ?
irld-62 ?
irld-63 ?
irld-64 ?
irld-65 ?
irld-66 ?
irld-67 ?

1As determined by TMHMM search (http://www.cbs.dtu.dk/services/TMHMM/)

2From www.wormbase.org

Table 18: The five classes of GPCRs

Name Subtypes Gene number in C. elegans Gene number in humans1 Gene identity
Rhodopsin (Class A) biogenic amine 16 ∼200 Table 19
muscarinic (ACh) 3 Table 19: gar-1,2,3
putative peptidergic >153 Table 20
chemosensory and others ∼1,280 ∼400 The putative chemoreceptor families of C. elegans
Secretin (Class B)   3 15 Table 20: pdfr-1, seb-2, seb-3
Adhesion (Class B)   5 33 Table 22: lat-1, lat-2, fmi-1, mth-1, mth-2
Glutamate receptor (Class C)   6 (7)2 22 Table 19: mgl-1,2,3, gbb-1,2, F35H10.10 (C30A5.10)2
Frizzled/Taste2   4 26 mig-1, lin-17, mom-5, cfz-2

1According to (Lagerstrom and Schioth, 2008).

2The assignment of this gene into this class is ambiguous (see text).

Table 19: Metabotropic neurotransmitter receptors (27 genes)

Type based on sequence homology Gene Ligand Expression1
metabotropic Glutamate receptor (5 genes) mgl-1 glutamate subset of neurons
mgl-2 neurons
mgl-3 subset of neurons
C30A5.10 ? ?
F35H10.10 ? ?
mAChRs (muscarinic acetylcholine receptors) (3 genes) gar-1 acetylcholine subset of neurons
gar-2 subset of neurons
gar-3 pharynx
metabotropic GABA receptors (2 genes) gbb-1 GABA broadly in nervous system
gbb-2 ?
biogenic amine receptor (16 genes) dop-1 dopamine subset of neurons
dop-2 subset of neurons
dop-3 subset of neurons
dop-4 ? subset of neurons
dop-5 ? subset of neurons
dop-6 ? subset of neurons
octr-1 octopamine subset of neurons
ser-3 subset of neurons, muscle
ser-6 subset of neurons
ser-1 serotonin subset of neurons, muscle
ser-4 subset of neurons
ser-5 subset of neurons, muscle
ser-7 subset of neurons
ser-2 tyramine subset of neurons, muscle
tyra-2 subset of neurons
tyra-3 subset of neurons
adenosine receptor ador-1 adenosine (?) ?

1From www.wormbase.org

Table 20: GPCR-type putative neuropeptide receptors and their grouping into families (153 genes)

Cosmid name Gene name Best human/fly neuropeptide receptor hit in BLAST search2 e-value2 Known ligand3 Expression4
Class B (secretin-type)1
C13B9.4 pdfr-1 fly PDF receptor −91 NLP-37 neurons, muscle
ZK643.3 seb-2 human calcitonin receptor −34   muscle
C18B12.2 seb-3 corticotropin-releasing factor receptor −44   neurons
Class A (rhodopsin-type)1
Neuropeptide F/Y receptor family. From analysis in (Cardoso et al., 2012) and from TF350004, expanded with genes in TF315303.
C39E6.6 npr-1 fly neuropeptide F receptor −50 FLP-18,21 neurons
T05A1.1 npr-2 fly neuropeptide F receptor −47   ?
C10C6.2 npr-3 fly neuropeptide F receptor −40 FLP-10 neurons
C16D6.2 npr-4 fly neuropeptide F receptor −48 FLP-4,10 neurons
Y58G8A.4 npr-5 fly neuropeptide F receptor −53 FLP-1,2 neurons
F41E7.3 npr-6 fly neuropeptide F receptor −65   neurons
F35G8.1 npr-7 fly neuropeptide F receptor −35   ?
C56G3.1 npr-8 fly neuropeptide F receptor5 −29   ?
C53C7.1 npr-10 fly neuropeptide F receptor −49 FLP-3 ?
C25G6.5 npr-11 fly neuropeptide F receptor −50 NLP-1 ?
T22D1.12 npr-12 fly neuropeptide F receptor −41   ?
ZC412.1 npr-13 fly neuropeptide F receptor −39   neurons, intestine
Ghrelin-obestatin/neuromedin U receptor family. From (Cardoso et al., 2012).
C48C5.1 nmur-1 human neuromedin receptor5 −48   subset neurons
K10B4.4 nmur-2 human neuromedin receptor −48 NLP-44 ?
F02E8.2 nmur-3 fly capa receptor −39   ?
C30F12.6 nmur-4 thyrotropin-releasing hormone receptor −70   pharynx, intestine
T07D4.1 npr-20 fly Tachykinin-like receptor −28   ?
T23C6.5 npr-21 human neuropeptide FF receptor −28   neurons
Neurokinin/neuropeptide FF/orexin receptor family. From (Cardoso et al., 2012).
C38C10.1 tkr-1 tachykinin receptor −51   glia
C49A9.7 tkr-2 tachykinin receptor −69   ?
AC7.1 tkr-3 tachykinin receptor −56   neurons
W05B5.2 npr-14 human orexin receptor −32   ?
Y59H11AL.1 npr-22 fly neuropeptide Y receptor −47 FLP-7,11 ?
C50F7.1 npr-35 fly SIFR homolog −56   ?
Somatostatin receptor family. From (Cardoso et al., 2012) and expanded with genes from TF334200
F56B6.5 npr-16 human somatostatin receptor −37   neurons
C06G4.5 npr-17 human somatostatin receptor −21   ?
C43C3.2 npr-18 human somatostatin receptor −18   ?
R106.2 npr-24 human somatostatin receptor −43   ?
T02E9.1 npr-25 human somatostatin receptor −24   ?
T02D1.6 npr-26 human somatostatin receptor −22   ?
F42C5.2 npr-27 human somatostatin receptor −19   ?
F55E10.7 npr-28 human somatostatin receptor −18   ?
ZC84.4 npr-29 human nociceptin receptor −12   ?
H10E21.2 npr-30 Somatostatin receptor −12   ?
T07F8.2 npr-31 human Rfamide peptide receptor −10   ?
Y116A8B.5 npr-32 allatostatin C receptor −35   ?
Galanin receptor family. From (Cardoso et al., 2012) and expanded with npr-33 from TF350000
ZK455.3 npr-9 allatostatin receptor −63   neurons
T27D1.3 npr-15 fly allatostatin receptor −31   ?
F31B9.1 npr-33 human galanin receptor −31   ?
Y54E2A.1 npr-34 human pyroglutamylated Rfamide receptor −32   ?
Gonadotropin-releasing hormone receptor family (TF106499)
F54D7.3 gnrr-1 GnRH receptor −49   neurons
C15H11.2 gnrr-2 fly FMRFamide receptor −08   ?
ZC374.1 gnrr-3 GnRH receptor −13   ?
C41G11.4 gnrr-4 GnRH receptor −16   ?
H22D07.1 gnrr-5 GnRH receptor −20   ?
F13D2.2 gnrr-6 human oxytocin receptor −16   ?
F13D2.3 gnrr-7 human serotonin receptor −14   ?
Y105C5A.23 daf-38 human vasopressin receptor −21   sensory neurons
Gastrin-cholecystokinin receptor family
T23B3.4 ckr-1 human CCK receptor −41   neurons
Y39A3B.5 ckr-2 CCK receptor −33 NLP-12 ?
Related to Vasopressin receptor family
T07D10.2 ntr-1 human vasopressin receptor −34   neurons
F14F4.1 ntr-2 human vasopressin receptor −31   neurons
Related to Sex peptide receptor family
R03A10.6 sprr-1 fly sex peptide receptor −60   ?
F42D1.3 sprr-2 fly sex peptide receptor −55   ?
Y69A2AR.15 sprr-3 fly sex peptide receptor −30   ?
Drosophila FMRFamide receptor family (TF316702)
C02B8.5 frpr-1 fly FMRFamide receptor −18   ?
C05E7.4 frpr-2 fly FMRFamide receptor −17   ?
C26F1.6 frpr-3 fly FMRFamide receptor −32 FLP-7,11 ?
C54A12.2 frpr-4 fly FMRFamide receptor −33   ?
C56A3.3 frpr-5 fly FMRFamide receptor −33   ?
F21C10.12 frpr-6 fly FMRFamide receptor −37   ?
F39B3.2 frpr-7 fly FMRFamide receptor −24   ?
F53A9.5 frpr-8 fly FMRFamide receptor −35   ?
F53B7.2 frpr-9 fly FMRFamide receptor −22   ?
F57H12.4 frpr-10 fly FMRFamide receptor −28   ?
K06C4.8 frpr-11 fly FMRFamide receptor −25   ?
K06C4.9 frpr-12 fly FMRFamide receptor −25   ?
K06C4.17 frpr-13 fly FMRFamide receptor −12   ?
K07E8.5 frpr-14 fly FMRFamide receptor −24   ?
K10C8.2 frpr-15 fly FMRFamide receptor −27   ?
R12C12.3 frpr-16 fly FMRFamide receptor −24   ?
T14C1.1 frpr-17 fly FMRFamide receptor −38   ?
T19F4.1 frpr-18 fly FMRFamide receptor −43 FLP-2 ?
Y41D4A.8 frpr-19 fly FMRFamide receptor −34   ?
C30B5.5 daf-37 fly FMRFamide receptor −16   sensory neurons
Drosophila Dromyosuppressin receptor family (TF315509)
C46F4.1 egl-6 Dromyosuppressin receptor −20 FLP-10,17 neurons
F57B7.1 dmsr-1 Dromyosuppressin receptor −45   ?
Y23H5B.4 dmsr-2 Dromyosuppressin receptor −16   ?
Y48C3A.11 dmsr-3 Dromyosuppressin receptor −13   ?
D1069.4 dmsr-4 Dromyosuppressin receptor −18   ?
Y48A6B.1 dmsr-5 Dromyosuppressin receptor −15   ?
Y54G11B.1 dmsr-6 Dromyosuppressin receptor −18   ?
C35A11.1 dmsr-7 Dromyosuppressin receptor −44   ?
C35A5.7 dmsr-8 Dromyosuppressin receptor −20   ?
ZC404.13 dmsr-9 Dromyosuppressin receptor −12   ?
ZC404.10 dmsr-10 Dromyosuppressin receptor −05   ?
ZC404.11 dmsr-11 Dromyosuppressin receptor −13   ?
H34P18.1 dmsr-12 Dromyosuppressin receptor −14   ?
T15B7.12 dmsr-13 Dromyosuppressin receptor −14   ?
T15B7.11 dmsr-14 Dromyosuppressin receptor −14   ?
T15B7.13 dmsr-15 Dromyosuppressin receptor −06   ?
T27B2.1 dmsr-16 Dromyosuppressin receptor5 −18   ?
Another Drosophila FMRFamide receptor family (TF315321)
E04D5.2   fly FMRFamide receptor −15   ?
T11F9.1   fly FMRFamide receptor −10   ?
R11F4.2   fly FMRFamide receptor −08   ?
Y37E11AL.1   fly FMRFamide receptor −07   ?
C54D10.5   fly FMRFamide receptor −07   ?
ZK1307.7   fly SIFamide receptor −07   ?
F32D8.10   fly FMRFamide receptor −06   ?
F56A11.4   fly FMRFamide receptor −05   ?
Y41D4B.24   fly leucokinin receptor −05   ?
F57A8.4   fly methuselah receptor −05   ?
Y40C5A.4   fly FMRFamide receptor −04   ?
C47E8.3   fly sex peptide receptor −04   ?
B0034.5   human Melanin-concent. hormone receptor −04   ?
Related family (TF315326) with fly ortholog (CG33639)
B0563.6   fly sex peptide receptor −13   ?
AH9.1   orexin receptor −12   ?
Y70D2A.1   fly FMRFamide receptor −11   ?
C17H11.1   adrenergic receptor −07   ?
Related family (TF317595) with fly ortholog (CG33696)
H09F14.1   fly peptide receptor −17   ?
F16C3.1   fly peptide receptor −14   ?
C24B5.1   fly peptide receptor −09   ?
Related family with no specific orthologs (TF315359)
R13H7.2   fly sex peptide receptor −14   neurons, intestine
K03H6.1   allatostatin receptor −10   ?
K03H6.5   growth hormone secretagogue receptor −10   ?
F40A3.7   fly sex peptide receptor −09   ?
W10C4.1   fly sex peptide receptor −08   ?
F59B2.13   proctolin receptor −08   neurons
T10E10.3   proctolin receptor −08   ?
D1014.2   allatostatin receptor −07   ?
Related family with no specific orthologs (TF315508)
C04C3.6   human cholecystokinin receptor −05   ?
ZK863.1   human cholecystokinin receptor −05   ?
C50H11.13   human cholecystokinin receptor −05   ?
C54E10.3   human cholecystokinin receptor −05   ?
T01B11.1   human neuropeptide S receptor −06   ?
Related family with no specific orthologs (TF316587)
T14B1.2 aex-2 human galanin receptor −06   neurons, muscle
C25B8.5 aexr-1 fly SIFamide receptor −10   ?
C25B8.7 aexr-2 human prokineticin receptor −11   ?
C48C5.3 aexr-3 CCK-like GPCR −05   ?
Related family with no specific orthologs (TF316160)
F52D10.4   fly FMRFamide receptor −05   ?
F56A12.2   fly FMRFamide receptor −07   ?
M04G7.3   fly orphan GPCR −05   ?
Related family with no specific orthologs (TF317550)
AH9.4   cholecystokinin receptor −06   ?
F54E4.2   fly cardioacceleratory peptide receptor −05   ?
Related family with no specific orthologs (TF318526)
C01F1.4   human neurokinin receptor −07   ?
F10D7.1   human galanin receptor −04   ?
H02I12.3   human adrenergic receptor −07   ?
FSHR ortholog (LRR repeats)
C50H2.1 fshr-1 FSH receptor −98   neurons, intestine
DmDopEcR ortholog
F59D12.1   fly DopEcR −53   ?
No obvious paralogs or orthologs
C02H7.2 npr-19 adrenergic receptor −12   ?
ZK813.5   fly Tachykinin-like receptor −10   ?
H23L24.4   fly FMRFamide receptor −09   ?
T02D1.4   fly FMRFamide receptor −08   ?
F36D4.4   human somatostatin receptor −07   ?
ZK721.4   fly CCK-like GPCR −07   ?
B0334.6   fly FMRFamide receptor −07   ?
Y34D9A.2 npr-23 human anaphylatoxin receptor −06   ?
C09F12.3   fly FMRFamide receptor −05   ?
F13H6.5   fly proctolin receptor −05   ?

This list was assembled using previously published accounts of putative neuropeptide receptors as a starting point (Keating et al., 2003; Wenick and Hobert, 2004; Janssen et al., 2010) and verifying these lists with BLAST searches. Grouping with vertebrate families was done as in (Cardoso et al., 2012). Additional genes were identified through clustering of gene families in TreeFam, and paralogs were assigned as presented on the Gene Summary pages in WormBase. In addition, genes identified in the srw gene family in Figure 2 of The putative chemoreceptor families of C. elegans were analyzed by BLAST. The InterPro domain IPR000276 (“GPCR, rhodopsin-like, 7TM”) contains 233 genes, most of which are clearly related to neuropeptide receptors; all 233 genes were therefore BLAST-analyzed, and genes with E values at or below an arbitrary cutoff of 1e-04 were included in the list above.

1GPCR class indicates a commonly used classification scheme (Lagerstrom and Schioth, 2008) with class A being rhodopsin-like receptors and class B being secretin-like receptors (see text).

2BLASTP analysis of only Homo sapiens and Drosophila melanogaster database.

3From (Li and Kim, 2010).

4From www.wormbase.org

5The gene is not shown in the respective TreeFam tree, but was assigned into the family together with the other Treefam family members by paralogy assignment presented at the Gene Summary pages on WormBase.
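
The inclusion rule described in the note to Table 20 (keep a gene when its best BLAST hit has an E value at or below 1e-04) can be illustrated with a minimal sketch. It assumes that the e-value column lists base-10 exponents (so "−91" stands for 1e-91), which is an interpretation of the table rather than something the text states; the function names and the small example dictionary are hypothetical and exist only for illustration.

```python
def parse_exponent(cell: str) -> int:
    """Turn a table cell such as '−91' or '-04' into the integer exponent."""
    # The table uses the Unicode minus sign (U+2212); normalise it to ASCII '-'.
    return int(cell.strip().replace("\u2212", "-"))


def passes_cutoff(cell: str, cutoff_exponent: int = -4) -> bool:
    """True when the E value is at or below 1e<cutoff_exponent> (assumed rule)."""
    return parse_exponent(cell) <= cutoff_exponent


# A few rows copied from Table 20, plus one made-up row that would be rejected.
best_hits = {
    "pdfr-1": "\u221291",
    "npr-23": "\u221206",
    "B0034.5": "\u221204",
    "hypothetical-gene": "\u221203",  # illustrative only, not in the table
}

included = [gene for gene, cell in best_hits.items() if passes_cutoff(cell)]
```

Comparing exponents rather than floating-point values avoids underflow and rounding concerns for very small E values, and it matches how the column is printed.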

Table 21: Representative analysis of srw genes reveals their relation to neuropeptide receptors

Gene1 Best hit e-value2 Expression3
srw-51 fly dromyosuppressin receptor −13 ?
srw-94 fly sex peptide receptor −12 ?
srw-33 fly sex peptide receptor −11 ?
srw-67 fly sex peptide receptor −10 ?
srw-29 fly peptide receptor GPCR −10 ?
srw-53 fly proctolin receptor −09 ?
srw-57 fly dromyosuppressin receptor −09 ?
srw-113 fly FMRFamide receptor −08 ?
srw-103 fly sex peptide receptor −08 neuronal
srw-44 fly dromyosuppressin receptor −08 ?
srw-42 fly sex peptide receptor −07 ?
srw-87 fly sex peptide receptor −07 ?
srw-122 fly FMRFamide receptor −07 ?
srw-42 fly sex peptide receptor −07 ?
srw-102 fly sex peptide receptor −06 ?
srw-115 fly FMRFamide receptor −06 ?
srw-118 fly FMRFamide receptor −06 hypodermis
srw-123 fly FMRFamide receptor −06 ?
srw-102 fly sex peptide receptor −06 ?
srw-73 fly proctolin receptor −05 ?
srw-127 human orexin receptor −05 ?
srw-139 fly dromyosuppressin receptor −05 subset of neurons
srw-8 human angiotensin II receptor −04 ?
srw-1 - above e-04 cutoff ?
srw-13 - above e-04 cutoff ?
srw-36 - above e-04 cutoff ?

1Representative members of several subbranches of the srw gene family (as shown in Figure 2 of The putative chemoreceptor families of C. elegans) were analyzed.

2BLASTP analysis of only Homo sapiens and Drosophila melanogaster database.

3From www.wormbase.org

Table 22: Adhesion-type GPCRs (5 genes)

Gene Homolog Domains Expression
fmi-1 Flamingo/Starry Night/CELSR Cadherin + EGF + LamG + HormR + GPS + 7TMR (secretin-type) neurons
lat-1 Latrophilin SUEL Lectin + HormR + DUF3497 + GPS + 7TMR (secretin-type) neurons, pharynx, reproductive tissues
lat-2 pharynx, excretory cell
mth-1 Drosophila methuselah-like1 None in N-terminus + GPS + 7TMR (secretin-type) ?
mth-2 ?

1This homology is not apparent by BLAST searches but is only picked out in the Panther database, PTHR12011

Table 23: Downstream of GPCRs (83 genes)

  Gene Notes on domain structure Expression1
Gα (21 genes) gsa-1, egl-30, goa-1, gpa-1, gpa-2, gpa-3, gpa-4, gpa-5, gpa-6, gpa-7, gpa-8, gpa-9, gpa-10, gpa-11, gpa-12, gpa-13, gpa-14, gpa-15, gpa-16, gpa-17, odr-3  
  • 2/21 non-neuronal

  • 9/21 exclusively some neurons

  • 10/21 neurons + non-neurons

Gβ (2 genes) gpb-1   ubiquitous
gpb-2   ubiquitous
Gγ (2 genes) gpc-1   subset of neurons
gpc-2   ubiquitous
RGS family (21 genes) axl-1   ?
eat-16   broad neuronal, muscle
egl-10   broad neuronal, muscle
grk-1   ?
grk-2   neurons, muscle
pry-1   neurons, muscle
rgs-10   broad neuronal
rgs-11   broad neuronal
rgs-2   neurons, muscle
rgs-3   subset of neurons
rgs-4   ?
rgs-5   neurons, muscle
rgs-6   broad neuronal
rgs-1   neurons
rgs-7   neurons, muscle
rgs-8.1   neurons
rgs-8.2   ?
rgs-9   broad
rhgf-1   neurons
snx-13   ?
snx-14   ?
GRK (2 genes) grk-1   intestine
grk-2   neurons
GPR/GoLoco (3 genes) gpr-1 (ags-3.2)   all mitotically dividing cells
gpr-2 (ags-3.3)   all mitotically dividing cells
ags-3   neurons, intestine, muscle
Gα GEF ric-8   neurons
Arrestins (31 genes)
  • β: arr-1

  • α: all others

arr-1 N+C domain  
arrd-1 N+C domain ?
arrd-2 N+C domain ?
arrd-3 N+C domain ?
arrd-4 N+C domain subset of sensory neurons
arrd-5 N+C domain ?
arrd-6 2x C domain subset of neurons
arrd-7 N+C domain ?
arrd-8 N+C domain ?
arrd-9 N+C domain ?
arrd-10 N domain only ?
arrd-11 N+C domain ?
arrd-12 N domain only ?
arrd-13 N+C domain ?
arrd-14 N+C domain ?
arrd-15 N+C domain subset of neurons
arrd-16 N+C domain ?
arrd-17 N+C domain ?
arrd-18 N+C domain ?
arrd-19 N+C domain ?
arrd-20 N domain only ?
arrd-21 N+C domain ?
arrd-22 N+C domain ?
arrd-23 N+C domain ?
arrd-24 N+C domain ?
arrd-25 N+C domain ?
arrd-26 N+C domain ?
arrd-27 N only ?
arrd-28 C only ?
rnh-1.2 N + RnaseH ?
ttm-2 N only ?

1From www.wormbase.org

Table 24: Making and breaking cGMP - Guanylyl cyclases and phosphodiesterases (40 genes)

Type Domains1 Genes Expression2
Receptor-type GCY (27 genes) ANF receptor domain + TM + PK + Cyc daf-11, odr-1, gcy-1, gcy-2, gcy-3, gcy-4, gcy-5, gcy-6, gcy-7, gcy-8, gcy-9, gcy-12, gcy-13, gcy-14, gcy-15, gcy-17, gcy-18, gcy-19, gcy-20, gcy-22, gcy-23, gcy-25, gcy-28, gcy-29
  • Non-neuronal only: 2/27

  • Nervous system: 25/27

  • Nervous system only: 21/27

  • Sensory neurons + other neurons: 25/27

  • Sensory neurons only: 15/27

  • Single neuron class-specific: 9/27

Ex. domain + TM + PK + Cyc gcy-11, gcy-21
TM + PK + Cyc gcy-27
Soluble GCY (7 genes) HNOB + Cyc gcy-31 through gcy-37 7/7 in sensory neurons only
cGMP-specific PDEs (4 genes) PDE pde-1 (hPDE1 ortholog) neurons
PDE + GAF pde-2 (hPDE2 ortholog) neurons, muscle, pharynx, intestine
PDE (no TM) pde-3 (hPDE3 ortholog) neurons, hypodermis
PDE + GAF pde-5 (hPDE10 ortholog) ?
cAMP-specific PDEs (2 genes) PDE pde-4 (hPDE4 ortholog) ?
PDE + PAS pde-6 (hPDE8 ortholog) ?

1ANF receptor domain = “IPR001828 Extracellular ligand-binding receptor”. This domain can also be found in metabotropic GABA and Glu receptors and in ionotropic Glu receptors. TM = transmembrane. PK = protein kinase-like. Cyc = guanylyl cyclase. HNOB = heme nitric oxide binding domain. Data taken from (Ortiz et al., 2006), but domains have been reanalyzed. Orthology relationships of pde genes were established by TreeFam and in (Conti and Beavo, 2007).

2From www.wormbase.org

Table 25: Receptors of CO2 and O2 (39 genes plus 7 soluble GCY genes)

Type Gene Expression pattern1
Carbonic anhydrase2 cah-1 neurons
cah-2 subset of neurons
cah-3 subset of neurons, intestine
cah-4 neurons, excretory cell
cah-5 subset of neurons, intestine
cah-6 subset of neurons
Globin glb-1 subset of neurons, muscle or hypodermis
glb-2 subset of neurons
glb-3 subset of neurons
glb-4 subset of neurons
glb-5 subset of neurons (oxygen sensory neurons URX, AQR/PQR, BAG)2
glb-6 subset of neurons
glb-7 subset of neurons
glb-8 muscle
glb-9 subset of neurons
glb-10 subset of neurons, enriched at synapse3
glb-11 subset of neurons
glb-12 subset of neurons
glb-13 subset of neurons
glb-14 subset of neurons, vulval muscle
glb-15 no observable expression
glb-16 subset of neurons
glb-17 subset of neurons
glb-18 subset of neurons
glb-19 subset of neurons
glb-20 subset of neurons, muscle
glb-21 subset of neurons, pharynx
glb-22 subset of neurons
glb-23 subset of neurons
glb-24 subset of neurons
glb-25 subset of neurons
glb-26 head mesodermal cell, stomato-intestinal muscle
glb-27 subset of neurons
glb-28 subset of neurons
glb-29 subset of neurons
glb-30 subset of neurons
glb-31 subset of neurons
glb-32 subset of neurons
glb-33 subset of neurons

1Mostly from reporter analysis done by (Hoogewijs et al., 2008).

2Convert CO2 into bicarbonate. Expression of carbonic anhydrase is generally considered to be a hallmark of CO2 responsive neurons (Bretscher et al., 2011).

3From (Sieburth et al., 2005).

Table 26: Synaptic vesicle proteins and their homologs (57 genes)

Overall type Homolog Gene (alt. name) Expression1
Calcium sensor for vesicle release synaptotagmin (7 genes) snt-1 broad neuronal
snt-2 ?
snt-3 ?
snt-4 broad neuronal
snt-5 ?
snt-6 ?
snt-7 ?
R-SNARE VAMP/synaptobrevin (9 genes) snb-1 ubiquitous
snb-2 ?
snb-5 ?
snb-6 ?
snb-7 ?
vamp-7 ?
vamp-8 broad neuronal
sec-22 muscle, reproductive system
ykt-6 ?
Q SNARE2 Qa subtype (10 genes) unc-64 (syx-1) neurons
syx-2 neurons
syx-3 intestine
syx-4 gonad
syx-5 neurons, muscle
syx-7 ?
syx-16 ?
syx-17 ?
syx-18 ?
Qb subtype (5 genes) gos-28 seam cells, intestine
memb-1 ubiquitous
memb-2 ?
sec-20 ?
vti-1 ?
Qc subtype (3 genes) syx-6 neuron, intestine
nbet-1 ?
use-1 ?
Qb/c subtype (3 genes) ric-4 (snap-25) broad neuronal
aex-4 only intestine
snap-29 broad
tetraspan vesicle proteins (TVPs) synaptogyrin sng-1 neurons
synaptophysin sph-1 muscle
SCAMP scm-1 subset of neurons
Other vesicle associated or regulatory proteins synapsin snn-1 broad neuronal
SVOP svop-1 ?
CSP alpha dnj-14 ?
Rab3 rab-3 panneuronal
rabphilin rbf-1 neurons
Rim unc-10 neurons
Rim-binding protein elks-1 broad neuronal
Munc-18 unc-18 neurons
T07A9.103 ?
Munc-13 unc-13 broad neuronal
synaptojanin unc-26 broad neuronal
endophilins unc-57 broad neuronal
erp-1 broad neuronal, muscle
complexin cpx-1 broad neuronal
cpx-2 subset of neurons
tomosyn tom-1 neurons
CAPS unc-31 broad neuronal
IA2 ida-1 subset of neurons

1From www.wormbase.org

2Subtype classification from http://bioinformatics.mpibpc.mpg.de/snare/snareQueryPage.jsp

3Possibly generated by unc-18 duplication, more similar to unc-18 than to any other member of the Sec1 superfamily, of which unc-18 is a member.

Table 27: PDZ domain proteins (70 genes)

  Gene (alt. name) Domain structure Homolog Expression1
PDZ only (31 genes) mpz-1 PDZ only (10) MPDZ neurons
par-3 PDZ only (3) Bazooka ?
C01F6.6/mpz-2 PDZ only (2) PDZK1 pharynx, intestine, excretory cell
mpz-3 PDZ only (2) none ?
mpz-4 PDZ only (2) none ?
mpz-5 PDZ only (2) none ?
mpz-6 PDZ only (2) none ?
Y42H9AR.1 PDZ only (2) GRASP65 ?
T21G5.4 PDZ only (2) paralog of C25G4.6 ?
C25G4.6 PDZ only (2) paralog of T21G5.4 ?
gipc-1 PDZ only (1) GIPC intestine
gipc-2 PDZ only (1) GIPC intestine, neurons
gopc-1 PDZ only (1) GOPC/CAL/PIST ?
gras-1 PDZ only (1) GRASP ?
mics-1 PDZ only (1) many ?
C09G1.4 PDZ only (1) none ?
C46H11.6 PDZ only (1) none muscle
C50D2.3 PDZ only (1) none ?
C52A11.3 PDZ only (1) none ?
F23C8.13 PDZ only (1) none ?
F40F9.3 PDZ only (1) none ?
T15H9.4 PDZ only (1) none ?
ZK849.1 PDZ only (1) none ?
Y52E8A.1 PDZ only (1) none ?
C01B7.5 PDZ only (1) none ?
T19B10.5 PDZ only (1) Periaxin(?) ?
psmd-9 PDZ only (1) PSMD9 ?
C45G9.7 PDZ only (1) tax1BP3 ?
F16G10.5 PDZ only (1) none ?
F20D6.1 PDZ only (1) none ?
mics-1 PDZ only (1) Magix ?
MAGUK type (8 genes) dlg-1 PDZ, SH3, GuKc DLG epithelial, neurons
lin-2 PDZ, SH3, GuKc CASK neurons
magu-1 PDZ, SH3, GuKc MAGUK family (MPP3) ?
magu-2 PDZ, SH3, GuKc MAGUK family (MPP5) ?
magu-3 PDZ, SH3, GuKc MAGUK family (MPP6) pharynx, intestine
magu-4 PDZ, GuKc TJP1/2 hypodermis, neurons
zoo-1 PDZ, SH3, GuKc ZO1/TJP3 hypoderm, muscle
magi-1 PDZ, WW, GuKc MAGI neurons
frm-5.1 B41, PDZ FERMPD ?
frm-5.2 B41, PDZ frm-5 paralogue (duplication) ?
frm-8 WW, PDZ, B41 FERMPD neurons
ptp-1 B41, FERM, PDZ, PTPc   ?
Others (27 genes) cnk-1 SAM, PDZ, PH CNKSR ?
kin-4 S/T kinase, PDZ none ?
sipa-1 RapGAP, PDZ SIPA1 ?
afd-1 RA, FH, PDZ afadin ?
lin-10 PTB, PDZ LIN10 very broad
alp-1 PDZ, ZM, LIM ALP/Enigma muscle, head neurons
ZK1321.4 PDZ, ZM none ?
nab-1 PDZ, SAM neurabin neurons, hypodermis
snx-27 PDZ, PX SNX27 ?
stn-2 PDZ, PH syntrophin neurons, muscle
stn-1 PDZ, PH syntrophin neurons, muscle
lim-8 PDZ, LIM none muscle
syd-1 PDZ, C2, RhoGAP SYD neurons
F45E4.3 PDZ, C2 none ?
unc-10 PDZ, C2 Rim1 neurons
rhgf-1 PDZ, C1, RhoGEF, PH ARHGEF neurons
C53B4.4 PDZ, C1 none ?
Y57G11C.22 PDZ, Arfaptin Arfaptin ?
par-6 PDZ, PB1   adult ?
let-413 LRR, PDZ Erbin, Scribble intestine, hypodermis, pharynx
lap-1 LRR, PDZ LAP intestine
lin-7 L27, PDZ    
mig-5 DAX, PDZ, DEP dishevelled embryo broad, neurons
dsh-2 DAX, PDZ dishevelled neurons, intestine
dsh-1 DAX, DEP, PDZ dishevelled neurons, pharynx
pxf-1 cNMP, RasGEF, PDZ RAPGEF neurons
shn-1 ANK, PDZ Shank ?

1From www.wormbase.org

Table 28: Gap junction proteins – the innexins (25 genes)

Genes Expression1
inx-1, inx-2, inx-3, inx-4, inx-5, inx-6, inx-7, inx-8, inx-9, inx-10, inx-11, inx-12, inx-13, inx-14, inx-17, inx-18, inx-19/nsy-5, unc-7, unc-9 neuronally expressed innexins
inx-15, inx-16, inx-20, inx-21, inx-22, eat-5 non-neuronally expressed innexins

1Assembled from (Altun et al., 2009).

Table 29: Kinesin-like Motor Proteins (21 genes)

Family1 Family Function2 Gene (alt. name) Homolog Expression3
Kinesin-1 Vesicle, organelle and mRNA transport unc-116 conventional broad
Kinesin-2 Vesicle and intraflagellar transport osm-3 (klp-2) KIF17 sensory neurons
klp-11 KIF3B,B sensory neurons
klp-20 KIF3A many neurons
Kinesin-3 Vesicle transport unc-104 (klp-1) KIF1A/KIF1B neurons
klp-4 KIF13A/B, KIF14 neurons, pharynx, intestine
klp-6 KIF13A/B, KIF14 neurons
Kinesin-4 Chromosome positioning klp-12 KIF21A,B dividing cells
klp-19 KIF4A dividing cells
Kinesin-5 Spindle pole separation, bipolarity bmk-1 (klp-14) BimC dividing cells
Kinesin-6 Central spindle assembly, cytokinesis zen-4 (klp-9) KIF20A,B, KIF23 dividing cells
Kinesin-12 Spindle pole organization klp-10 KIF15 ?
klp-18 KIF15 germline
Kinesin-13 chromosome segregation klp-7 KIF2A,B,C ?
Kinesin-14 Spindle pole organization and cargo transport klp-3 KIFC2, KIFC3 pharynx
klp-15 KAR3 ?
klp-16 KAR3 ?
klp-17 KAR3 ubiquitous
atypical   vab-8 (klp-5)   neurons, muscle
klp-8   neurons, hypodermis, excretory
klp-13   neurons

1According to (Hirokawa et al., 2010).

2According to (Verhey and Hammond, 2009).

3From www.wormbase.org

Table 30: Dynein motors (17 genes)

Class Gene (alt. name) Adult expression1
Heavy chain dhc-1 broad
che-3 (dhc-2) ciliated neurons
dhc-3* neurons
dhc-4* neurons
Intermediate chain dyci-1 muscle, gonad
light intermediate chain xbx-1 ciliated neurons
dli-1 neurons, hypodermis, pharynx
Light chain (LC8-type) dlc-1 broad
dlc-2 neurons, intestine, muscle
dlc-3 muscle
dlc-4 ?
dlc-5 pharynx
dlc-6 ?
Light chain (Tctex1-type) dylt-1 broad
dylt-2 (xbx-2) ciliated neurons
dylt-3 pharynx
Light chain (Roadblock-type) dyrb-1 broad

*While dhc-1 and che-3 contain the typical DHC N1 and N2 domains, dhc-3 and dhc-4 only contain N2 domains.

1From www.wormbase.org

Table 31: Myosin Motors (18 genes)

Gene (alt. name) Adult Expression1
hum-1 neurons, intestine, muscle
hum-2 ?
spe-15 (hum-3) ?
hum-4 ?
hum-5 reproductive system
hum-6 ?
hum-7 dividing cells
hum-8 ?
hum-9 ?
hum-10 ?
nmy-1 neurons
nmy-2 neurons, intestine
myo-1 pharynx
myo-2 pharynx
myo-3 muscle
unc-54 (myo-4) muscle
myo-5 pharynx
myo-6 ?

1From www.wormbase.org

Table 32: Sensory cilia transport (35 genes)

IFT modules/components Gene name Description
Kinesin-II klp-11 motor
klp-20 motor
kap-1 auxiliary
OSM-3 Kinesin osm-3 motor
IFT-Dynein che-3 Heavy chain
xbx-1 Light intermediate chain
dyci-1 Intermediate chain
xbx-2 Light chain
IFT-A (5 genes) dyf-2 IFT144
che-11 IFT140
ZK328.7 IFT139
daf-10 IFT122
ifta-1 IFT121
IFT-B (14 genes) osm-1 IFT172
osm-5 IFT88
ift-81 IFT81
che-2 IFT80
ift-74 IFT74/72
dyf-1 IFT70
che-13 IFT57/55
dyf-11 IFT54
osm-6 IFT52
dyf-6 IFT46
ifta-2 IFT22
Y110A7A.20 IFT20
dyf-3 Qilin
BBSome (8 genes) bbs-1 BBS1
bbs-2 BBS2
arl-6 BBS3
bbs-4 BBS4
bbs-5 BBS5
osm-12 BBS7
bbs-8 BBS8
bbs-9 BBS9

The list has been adapted from (Inglis et al., 2009). Expression of all examined genes is observed in sensory neurons (www.wormbase.org).

Table 33: Extracellular Immunoglobulin (Ig) and Leucine rich repeat (LRR) domain-containing proteins (93 genes)

Protein family Gene name2 Domains Expression3
Ig domain proteins1 (56 genes) cam-1 (ROR) Ig + Frz + Kr + TyrKinase neurons
  egl-15 (FGFR) 3 Ig + TM + TyrKinase hypodermis
  ver-1 5 Ig + TM + TyrKinase neurons, muscle
  ver-3 4 Ig + TM + TyrKinase neurons, muscle
  ver-4 4 Ig + TM + TyrKinase ?
  clr-1 1 Ig + 2 Fn3 + TyrPhosphatase ?
  ptp-3 (LAR) 3 Ig + 9 Fn3 + TyrPhosphatase muscle, neurons
  ptp-4 1 Ig + 3 Fn3 + TyrPhosphatase ?
  dig-1 many domains (extracell. matrix) hypodermis, mesoderm
  him-4 many domains (extracell. matrix) muscle
  unc-52 (perlecan) many domains (extracell. matrix) muscle
  igcm-1 7 Ig + 2 Fn3 + TM neurons, muscle, seam
  igcm-2 3 Ig + 2 Fn3 + TM some neurons, intestine
  igcm-3 3 Ig + TM hypodermis, muscle
  igcm-4 3 Ig + TM ?
  mig-6 Ig + TSP1 + KU muscle
  oig-1 1 Ig (secreted) some neurons
  oig-2 1 Ig (secreted) some neurons
  oig-3 1 Ig (secreted) some neurons, pharynx
  oig-4 1 Ig (secreted) muscle
  oig-5 1 Ig (secreted) ?
  oig-6 1 Ig + TM ?
  oig-7 1 Ig + TM ?
  oig-8 1 Ig + TM ?
  rig-1 6 Ig + 2 Fn3 + TM neurons, muscle, hypodermis
  rig-3 4 Ig + GPI neurons
  rig-4 6 Ig + 13 Fn3 + TM neurons, muscle, hypodermis
  rig-5 3 Ig + GPI neurons, gut
  rig-6 6 Ig + 4 Fn3 + GPI neurons, muscle
  ncam-1 5 Ig + 1 Fn3 + TM neurons, gut
  sax-3 (Robo) 6 Ig + 3 Fn3 + TM broad neuronal
  sax-7 (L1) 6 Ig + 5 Fn3 + TM very broad
  lad-2 (L1) 6 Ig + 5 Fn3 + TM some neurons
  syg-1 5 Ig + TM neurons, muscle
  syg-2 8 Ig + 1 Fn3 + TM neurons, muscle
  unc-40 (DCC) 4 Ig + 6 Fn3 + TM broad neuronal
  unc-5 Ig + TSP1 + TM + ZO1 + DEATH some neurons
  wrk-1 3 Ig + 1 Fn3 + GPI some neurons, gut
  zig-1 2 Ig + TM broad neuronal
  zig-2 2 Ig (secreted) subset of neurons
  zig-3 2 Ig (secreted) subset of neurons
  zig-4 2 Ig (secreted) subset of neurons
  zig-5 2 Ig (secreted) subset of neurons
  zig-6 2 Ig (secreted) subset of neurons
  zig-7 2 Ig (secreted) subset of neurons
  zig-8 2 Ig (secreted) subset of neurons
  zig-9 2 Ig (secreted) ?
  zig-10 2 Ig + TM ?
  igeg-1 Ig + EGF + TM ?
  igeg-2 Ig + EGF + TM ?
  igdb-1 1 Ig + 4 Fn3 + 2 DB + TM ?
  igdb-2 2 Ig + 5 Fn3 + 4 DB + TM hypodermis
  igdb-3 1 Ig + 1 Fn3 + 2 DB ?
  madd-4 Ig + many TSP1 (secreted) neurons
    3 Ig + 6 Fn3 (secreted) ?
    Sushi domains + 1 Ig (paralog of and adjacent to lev-9) ?
Ig + LRR proteins (6 genes) pxn-1 (Peroxidasin) LRR + Ig + peroxidase (secreted) ?
  pxn-2 (Peroxidasin) LRR + Ig + peroxidase (secreted) neurons, hypodermis
  sma-10 LRRs + Ig + TM hypodermis, intestine, pharynx
  iglr-1 LRRs + Ig + TM neurons in head and VNC.
  iglr-2 LRRs + Ig + TM ?
  iglr-3 LRRs + Ig (secreted) ?
eLRR proteins (23 genes) fshr-1 (FSHR) LRRs + 7TMR neurons, intestine
  slt-1 (SLT) LRRs + EGF + LamG epidermis, muscle
  tol-1 LRRs + TM + TIR neurons, epithelial
  pan-1 LRRs + TM hypodermis, pharynx, head muscles
  let-4 (sym-5) LRRs + TM ?
  egg-6 LRRs + TM hypodermis, pharynx
  dma-1 LRRs + TM subset of neurons
  lron-1 LRRs + TM pharynx
  lron-2 LRRs + TM pharynx
  lron-3 LRRs + TM ?
  lron-4 LRRs + TM ventral nerve cord
  lron-5 LRRs + TM many neurons in head, VNC, and tail
  lron-6 LRRs + TM many neurons in nerve ring and VNC
  lron-7 LRRs + TM intestine
  lron-8 LRRs + TM hypodermis, pharynx, muscle
  lron-9 LRRs + TM head, VNC neurons, seam cells, muscles
  lron-10 LRRs + TM ?
  lron-11 LRRs + TM pharynx, hypodermis
  lron-12 LRRs + GPI ?
  lron-13 LRRs + GPI ?
  lron-14 LRRs + GPI numerous neurons in head and VNC
  lron-15 LRRs (secreted) ?
  sym-1 LRRs (secreted) ?

Proteins were identified from searches of the SMART and InterPro domain databases.

Intracellular proteins with Ig or LRR domains are excluded from the list. They were identified either by the presence of other domains known to have cytoplasmic function or by the absence of a detectable signal sequence, as assessed by SignalP.

1The cutoff for inclusion in this list is somewhat arbitrary because Ig domains can degenerate significantly, which makes them difficult to predict (for example, T17A3.10; this gene may in fact be fused to ver-2).

2Some well-known vertebrate orthologs are listed in parentheses.

3From www.wormbase.org, except eLRR expression patterns, which are from (Liu and Shen, 2011).
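
As a rough illustration of the exclusion rule described in the notes above, the following Python sketch filters a list of candidate proteins. It is a sketch under stated assumptions, not the procedure actually used for the table: the `Candidate` record, the contents of `CYTOPLASMIC_MARKER_DOMAINS`, and the protein names are hypothetical, the source does not say which cytoplasmic domains were considered, and the signal-peptide flag stands in for a SignalP prediction obtained separately.

```python
from dataclasses import dataclass

# Assumption: an illustrative set of domains treated as markers of
# cytoplasmic function. The source does not list the domains it used.
CYTOPLASMIC_MARKER_DOMAINS = frozenset({"SH2", "SH3"})

# Domain families that qualify a protein for the table in the first place.
TARGET_DOMAINS = frozenset({"Ig", "LRR"})


@dataclass(frozen=True)
class Candidate:
    name: str                 # hypothetical protein name
    domains: frozenset        # domain names, e.g. from a SMART/InterPro search
    has_signal_peptide: bool  # supplied by the caller, e.g. from SignalP output


def is_intracellular(candidate: Candidate) -> bool:
    """Return True if either exclusion criterion from the notes applies."""
    has_cytoplasmic_domain = bool(candidate.domains & CYTOPLASMIC_MARKER_DOMAINS)
    return has_cytoplasmic_domain or not candidate.has_signal_peptide


def extracellular_candidates(candidates):
    """Keep Ig- or LRR-containing proteins that are not flagged as intracellular."""
    return [
        c for c in candidates
        if (c.domains & TARGET_DOMAINS) and not is_intracellular(c)
    ]


if __name__ == "__main__":
    # Hypothetical input records, not data from the table.
    examples = [
        Candidate("hypo-1", frozenset({"Ig", "Fn3"}), True),
        Candidate("hypo-2", frozenset({"Ig", "SH3"}), True),
        Candidate("hypo-3", frozenset({"LRR"}), False),
    ]
    for kept in extracellular_candidates(examples):
        print(kept.name)
```

Treating the two criteria as a logical OR follows the wording of the note ("either ... or"); a stricter or more lenient combination would change which borderline proteins are retained.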


Table 34: Cadherins (13 genes)

Gene (alt. name) Homology Expression1
hmr-1 classic neurons
cdh-3 fat-like epidermis
cdh-4 fat-like neurons, epithelial
fmi-1 (cdh-6) flamingo-like neurons
casy-1 (cdh-11) calsyntenin-like neurons, intestine
cdh-1 dachsous-like ?
cdh-5 nematode-specific ?
cdh-7 nematode-specific ?
cdh-8 nematode-specific ?
cdh-9 nematode-specific ?
cdh-10 nematode-specific epithelial cells
cdh-12 nematode-specific ?
Y37E11AL.6 nematode-specific (Cdh domain & EGF domain) ?

1From www.wormbase.org

Table 35: Neurexin superfamily and neurexin ligands (8 genes)

  Gene Homolog Expression1
Neurexin superfamily nrx-1 classic neurexin broad neuronal
  itx-1 CASPR-like glia, intestine
  nlr-1 CASPR-like subset of neurons
  bam-2 divergent neurons, epidermis
Neurexin ligands lat-1 latrophilin intestine, neurons, muscle
  lat-2 latrophilin pharynx, excretory
  nlg-1 neuroligin neurons, muscle
  glit-1 gliotactin (neuroligin-like) ?
  -* LRRTM1,2
  - Cbln1
  - Neurexophilins

*Even though no obvious orthologs can be found, proteins with similar domain architecture are encoded in the genome.

1From www.wormbase.org

15. Acknowledgements

Work in my laboratory is funded by the National Institutes of Health (R01NS039996-05; R01NS050266-03) and the Howard Hughes Medical Institute. I thank Jonathan Hodgkin for discussion and his involvement in gene naming, and Michael Koelle, Jim Rand, Thomas Boulin, Ines Carrera, Jeremy Dittman, Martin Chalfie, Niels Ringstad, Piali Sengupta, Chris Li, Iva Greenwald, and especially Erik Jorgensen for comments on the manuscript.

16. References


Almedom, R.B., Liewald, J.F., Hernando, G., Schultheis, C., Rayes, D., Pan, J., Schedletzky, T., Hutter, H., Bouzat, C., and Gottschalk, A. (2009). An ER-resident membrane protein complex regulates nicotinic acetylcholine receptor subunit composition at the synapse. EMBO J. 28, 2636-2649.

Altun, Z.F., Chen, B., Wang, Z.W., and Hall, D.H. (2009). High resolution map of Caenorhabditis elegans gap junction proteins. Dev. Dyn. 238, 1936-1950.

Alvarez, C.E. (2008). On the origins of arrestin and rhodopsin. BMC Evol. Biol. 8, 222.

Anderson, J.F., and Ultsch, G.R. (1987). Respiratory gas concentrations in the microhabitats of some Florida arthropods. Comp. Biochem. Physiol. 88A, 585-588.

Askwith, C.C., Cheng, C., Ikuma, M., Benson, C., Price, M.P., and Welsh, M.J. (2000). Neuropeptide FF and FMRFamide potentiate acid-evoked currents from sensory neurons and proton-gated DEG/ENaC channels. Neuron 26, 133-141.

Aubry, L., Guetta, D., and Klein, G. (2009). The arrestin fold: variations on a theme. Curr. Genomics 10, 133-142.

Aurelio, O., Hall, D.H., and Hobert, O. (2002). Immunoglobulin-domain proteins required for maintenance of ventral nerve cord organization. Science 295, 686-690.

Bargmann, C.I. (1998). Neurobiology of the Caenorhabditis elegans genome. Science 282, 2028-2033.

Bargmann, C.I. (2006). Chemosensation in C. elegans (October 25, 2006). WormBook, ed. The C. elegans Research Community, doi/10.1895/wormbook.1.123.1, http://www.wormbook.org.