Token-Curated Registry with Citation Graph

In this study, we aim to incorporate the expertise of anonymous curators into a token-curated registry (TCR), a decentralized recommender system for collecting a list of high-quality content. This registry is important, because previous studies on TCRs have not specifically focused on technical content, such as academic papers and patents, whose effective curation requires expertise in relevant fields. To measure expertise, curation in our model focuses on both the content and its citation relationships, for which curator assignment uses the Personalized PageRank (PPR) algorithm while reward computation uses a multi-task peer-prediction mechanism. Our proposed CitedTCR bridges the literature on network-based and token-based recommender systems and contributes to the autonomous development of an evolving citation graph for high-quality content. Moreover, we experimentally confirm the incentive for registration and curation in CitedTCR using the simplification of a one-to-one correspondence between users and content (nodes).


INTRODUCTION
For many blockchain-based decentralized applications (DApps), one of the challenges is the reliability of information originating from an off-chain environment. This is because the Bitcoin protocol [24], which is the origin of DApps and has a novelty of building a reliable consensus among anonymous users (on a public peer-to-peer network), only computes information generated from an on-chain environment (i.e., transaction records of Bitcoin). For example, consider the case of a simple DApp that provides alerts when it rains in a given location. In this case, while the DApp can ensure the on-chain state transition leads to an alert, it cannot ensure the off-chain fact (used as the trigger) that it has actually rained at the location. Therefore, most DApps rely on trusted third parties, such as the National Weather Service, for their input 1 . This is in contrast to the Bitcoin protocol, which functions even if the operators of each node are unknown. Consequently, DApps require an additional protocol in which anonymous users can build reliable consensus on off-chain information to maintain the novelty of the Bitcoin protocol.
A token-curated registry [11,12] (TCR) is a DApp for establishing such a protocol. It specializes in compiling a high-quality, reliable list of off-chain content (e.g., restaurants, universities, and webpages) as a recommender system 2 . Although there are different design patterns among existing TCRs [20], their consensus building is generally based on a token-staking scheme in which all users can stake their tokens on a binary choice {accept, reject} as curators whenever an applicant posts new content to the list 3 . The consensus is the choice that has obtained more tokens than the other choice after a certain period. Moreover, all staked tokens are redistributed among the curators who staked their tokens on the consensus side, i.e., token staking is intended to yield informative reports from anonymous curators who risk losing their tokens as well as the token price, which is assumed to fluctuate with the quality of the list. One limitation is that token staking does not reflect expertise in consensus building because, regardless of specialty, any user with a certain number of tokens can participate in the curation. Therefore, the reliability of consensus is restricted under TCRs, which primarily depend only on token staking, particularly when the off-chain content is technical (e.g., academic papers and patents) and requires expertise in specific fields for effective curation.

Fig. 1. (a) Existing TCRs. (b) CitedTCR. TCRs in both cases select curators to decide whether to accept newly proposed content $x$ into the list of off-chain content $\{A, B, \cdots\}$. However, while existing TCRs (a) manage an unstructured list $V_t$ that can be curated by any user who stakes a certain number of tokens, CitedTCR (b) manages an evolving directed acyclic graph (DAG)-structured list $G_t(V_t, E_t)$ whose curators are assigned according to citation relationships.

Accordingly, in this study, we aim to incorporate the expertise of anonymous curators into TCRs using a protocol called CitedTCR, which leverages a citation graph for curator assignment and uses a peer-prediction mechanism to compute the number of reward tokens paid to the curators. Fig. 1 illustrates the role of the citation graph in our protocol. Fig. 1 (a) shows that existing TCRs manage an evolving unstructured list (as a set) $V_t$, in which an applicant posts new content (as an element) $x$ and any user can be the curator of $x$ because of token staking. In contrast, Fig. 1 (b) shows that CitedTCR manages an evolving list $G_t(V_t, E_t)$ with a citation graph (i.e., a DAG) structure, in which an applicant posts $x$ and its out-edges $(x, A), (x, B)$ point to existing nodes $\{A, B\}$ as references. Moreover, the curator role is stochastically assigned to a given number of users who have posted nodes (e.g., $\{D, F, G\}$) that have both high similarity with $x$'s reference nodes $\{A, B\}$ and high centrality in $G_t$ 4 . CitedTCR assigns appropriate curators in a manner similar to the academic peer-review process, in which researchers who have produced high-quality papers with a large number of citations are more likely to be selected as reviewers in their field of expertise. Note that this form of curator assignment serves as an incentive for applicants to register high-quality content in CitedTCR because users may have more opportunities to obtain reward tokens as curators if their content in $G_t$ attracts a large number of citations 5, 6 .
The citation graph serves as a proxy for the expertise of anonymous curators; therefore, the reliability of $G_t$ from the perspective of both curation and registration is ensured.
Peer prediction is a mechanism of game theory for eliciting informative reports for tasks with no ground truth, such as the peer review of academic papers and online product reviews by consumers. In particular, peer prediction compares user reports for the same task to create a truthful (known as strategy-proof or incentive-compatible) environment, in which no user can obtain a higher utility by any possible strategy deviating from the user's true preferences [26]. CitedTCR uses peer prediction for reward computation, in which it is assumed that the assigned curators can obtain newly issued reward tokens if they return a binary signal {accept, reject} as a report for $x$ and $x$'s citation relationships 7 . This mechanism addresses two problems in the token-staking scheme, which are even more critical under CitedTCR. The first problem is the risk of strategic misreports (such as collusion) among curators. Although this has been discussed for existing TCRs [5,7], token staking becomes more vulnerable to this risk in CitedTCR because CitedTCR assigns a fixed number of homogeneous curators with similar expertise. The second problem is the lack of incentive to participate in consensus building because of the risk of losing staked tokens (see Appendix A). Note that strengthening the weak incentive of token staking is a common topic in TCRs [35]. Stronger incentives are particularly important for CitedTCR, in which reports elicited from assigned curators are the key to reflecting expertise in $G_t$. Therefore, rather than token staking, we use a peer-prediction mechanism that provides maximum (new) rewards for informative reports.
CitedTCR is thus a hybrid of token-based and network-based recommender systems because it recommends both $V_t$ ($G_t$) curated by tokens and curators assigned according to $G_t$. In this study, as a first step of this hybrid approach, we used the Personalized PageRank [15] (PPR) algorithm for curator assignment, and a peer prediction mechanism called DG13 proposed by Dasgupta and Ghosh [6] for reward computation. In addition to their popularity, both PPR and DG13 have several favorable properties for CitedTCR as demonstrated in Sections 2 and 3. Moreover, we assume that users in this study have a one-to-one correspondence with $V_t$ ($G_t$). This assumption is intended to simplify the curation process into a state transition in $G_t$; its details are discussed in Section 3.
The remainder of this paper is organized as follows. In Section 2, we introduce related studies and contributions from the perspective of three components: TCR, the PPR algorithm, and a peer-prediction mechanism. In Section 3, we describe the specification of CitedTCR, including the role of PPR and DG13. In Section 4, we examine the practical utility of our proposal using two step-wise simulations with the citation graph of academic papers. Finally, in Section 5, we conclude the paper with a summary of achievements and remaining concerns.

RELATED WORK

TCR
Since Goldin [11,12] proposed the initial design in 2017, TCRs have been implemented in a number of applications such as the adChain registry 8 for webpages, the Ocean protocol 9 for user reputations, and the Civil registry 10 for news articles. Because TCR is a recent development, most of the discussion at present takes place in blog articles whose topics vary from the classification of design patterns [20] to critical examinations of token staking [3,5]. A reading list curated by the blockchain community [22,29] would be helpful for summarizing this discussion. In addition to blog articles, TCRs have been examined in academic papers, primarily from a game-theoretic perspective. For example, Asgaonkar and Krishnamachari [2] presented a mathematical foundation of the TCR 1.1 model [12] to determine the sufficient conditions for each consensus at equilibrium. Wang and Krishnamachari [35] introduced enhanced token staking with a new issuance of reward tokens to create an incentive to participate in consensus building. Moreover, Falk and Tsoukalas [7] used an axiomatic approach to demonstrate the limitations of a token-staking scheme, in which the expected rewards are proportional to the amount of staking.

5 We will confirm the strength of this incentive in Section 4.
6 As a similar incentive, TCRs using token staking often require applicants to stake a certain amount of their tokens on the {accept} choice.
7 As described in Section 3, x is listed including its citation relationship if the number of {accept} reports exceeds a given threshold.
8 https://metax.io/en/products/adchain_registry/ (accessed April 20, 2019)
9 https://oceanprotocol.com/ (accessed April 20, 2019)
10 https://registry.civil.co/registry/approved (accessed April 20, 2019)
As mentioned in Section 1, in this study, we aim to incorporate the expertise of anonymous curators into TCRs using a combination of citation graphs and peer prediction (i.e., PPR and DG13). This approach is novel because previous studies and blog articles on TCRs have not explicitly addressed the mechanism for technical content, such as academic papers and patents, whose effective curation requires expertise in relevant fields.

The PPR algorithm
The PPR [15] algorithm, originally named topic-sensitive PageRank, is an extension of the PageRank [4,27] algorithm and computes a score of importance for each node from the viewpoint of the entire network structure. While the PageRank score originates from a random walk on the network, PPR allows this random walk to return to the predetermined set of nodes with a given probability 11 , thereby adapting the score to recommender systems (see Section 3.2 for details). In many recommender systems using PPR, CitedTCR is most closely related to the PaperRank algorithm proposed by Gori and Pucci [13], which applies PPR to a citation graph of academic papers to generate useful paper-to-paper recommendations. Moreover, PPR is a component of several paper-to-reviewer assignment systems [18,19] that attempt to recommend appropriate peer reviewers for a submitted paper.
From the perspective of PPR, the contribution of this study is that CitedTCR bridges, for the first time, the literature on network-based and token-based recommender systems to strengthen the reliability of the consensus. New economy movement (NEM) [25] is a representative precedent of blockchain-based protocols that leverage a network structure for consensus building. However, NEM is not specific to TCRs and manages on-chain transaction records using a network-based score different from that of PPR.

Peer-prediction mechanism
Peer prediction was first introduced by Miller et al. [23] as an application of the proper scoring rule [9] and game theory 12 . To model the problem of eliciting private information, reward (score) computation assumes an environment in which each user reports probabilistic but correlated signals based on the assigned tasks. As examined by Jurca and Faltings [17], a common problem in the mechanism proposed by Miller et al. and subsequent mechanisms is that the computation has multiple Nash equilibria, including uninformative ones in which elicited reports are independent of the true signals 13 ; e.g., the same signals or random signals are always reported to avoid the effort of observation. As a solution to this problem, Dasgupta and Ghosh [6] proposed a multi-task peer-prediction mechanism called DG13 that assigns multiple tasks to one user and computes rewards for one task using the reports produced for other tasks. Under the assumption of positively correlated binary signals, DG13 ensures strong truthfulness [32], in which an equilibrium by informative reports has the highest rewards among other realistic equilibria (see Section 3.3 for details). CitedTCR uses DG13 because the abovementioned properties of multi-tasking, strong truthfulness, and binary signals are compatible with the general settings of TCRs, in which curators evaluate multiple content using binary choices.
To our knowledge, CitedTCR is the first proposal that uses a peer-prediction mechanism in TCRs. This proposal presents an approach that can overcome the aforementioned two problems in the token staking. In addition to DG13, recent studies on peer prediction have discussed topics relevant to TCRs. For example, Agarwal et al. [1] proposed a multi-task mechanism that assigns appropriate tasks to heterogeneous users (with various propensities) based on accumulated reports. This can contribute to TCRs with expertise as an approach different from citation graphs. Goel et al. [10] assessed the robustness of a peer-prediction mechanism for the case in which an incentive for misreporting exists outside the system with an application to decentralized oracles 14 . Their assessment can be applicable to TCRs with a design similar to that of decentralized oracles.

MODEL
In this section, we describe the specification of CitedTCR as a state transition closed on list $G_t$. This simplification, achieved by several assumptions, including the aforementioned one-to-one correspondence, is useful for an algorithmic expression and for the experimental simulations described in Section 4. Moreover, we present details of PPR and DG13 that clarify how these components contribute to curation in CitedTCR.

Setup
As depicted in Fig. 1 (b), CitedTCR manages a DAG-structured list $G_t(V_t, E_t)$, where $V_t$ denotes the set of registered content and $E_t \subseteq V_t \times V_t$ denotes their citation relationships. Although $G_t$ is managed by a set of users $U_t$ (as with other DApps), we impose the following assumption on the management of $G_t$.
Assumption 1 (One-to-one correspondence): Suppose that there is a one-to-one correspondence between $U_t$ and $V_t$, i.e., $f : U_t \to V_t$ is bijective.
A one-to-one correspondence indicates an environment in which a user can neither post more than one piece of content nor share one piece of content as a co-applicant. This setting frees our model from several complex problems in DApps, such as spamming and Sybil attacks, and makes curator assignment equivalent to node selection in $G_t$.
We further assume that only one node $x$ proposes an additional citation graph $\tilde{G}_t$ (composed of $x$ and the references of $x$) in each period, and that $\tilde{G}_t$ is not delisted once it is accepted into $G_t$. This assumption and the one-to-one correspondence make it possible to represent CitedTCR as a state transition $\{G_t\}_{t=0}^{\infty}$ that repeatedly determines whether to accept $\tilde{G}_t$ in each period 15 . In particular, the transition from $G_t$ to $G_{t+1}$ can be summarized as follows:
(1) Proposal: Node $x$ proposes $\tilde{G}_t(\{x\} \cup V_x, E_x)$, where $V_x \subset V_t$ denotes the reference nodes of $x$, and $E_x$ denotes directed edges from $x$ to $V_x$.
(2) Curator assignment: Select $n\ (\geq 2)$ 16 nodes $C_t = \{1, 2, \cdots, n\}$ as curators from $V_t \setminus V_x$, where $n$ is an exogenous variable 17 .
(3) Report collection: Collect reports $R_t = \{r_1, r_2, \cdots, r_n\}$ from $C_t$, where $r_c \in \{0, 1\}$ denotes curator $c$'s report for $\tilde{G}_t$. Here, $r = 0$ and $r = 1$ designate reject and accept, respectively.
(4) Reward computation: Compute the reward of each curator from the collected reports.

This state transition can be written as pseudocode in Algorithms 1 and 2, in which, as commented, curator assignment (step 2) uses PPR, and reward computation (step 4) uses DG13. These algorithms have the following two properties. First, they integrate steps 2 and 3 as the Curation(n, C, R, G) function (Algorithm 2), which returns a set of reports R for the following four arguments: n, the number of reports; C, the set of nodes that are candidates for the curator; R, the initial value of the set of reports; and G, the graph containing C. This integration is intended to handle a case in which assigned curators do not provide their reports within a given period of time. In this case, Curation(n, C, R, G) continues to reselect new nodes as replacements for unresponsive curators until it collects n reports. Second, they return not only $G_{t+1}$ and $\Theta$ but also the stock of reports $\mathcal{R}_{t+1}$. This property is specific to DG13, which, as one of the multi-task peer prediction mechanisms, leverages both the flow $R_t$ (the reports elicited in period $t$) and the stock $\mathcal{R}_t$ (all reports elicited up to period $t$) for reward computation. Algorithm 1 can be simplified by adopting other intratemporal mechanisms such as token staking.

Algorithm 1. State transition in CitedTCR.
Algorithm 2. Report collection and curator assignment in CitedTCR.

14 Decentralized oracle is a broader concept than TCR, which includes every DApp responsible for consensus building on off-chain content, i.e., TCR can be interpreted as one of the decentralized oracle systems. The term decentralized oracle is often used in the context of prediction markets, and representative platforms (e.g., Augur [28], Gnosis [33]) use token staking for their consensus building, as is the case with TCRs.
15 Therefore, when managing CitedTCR, we need to prepare in advance an initial state $G_0$ with a sufficient number of nodes and edges.
16 The condition $n \geq 2$ is important for the DG13 mechanism, as we will see in Section 3.3.
17 Thus, $|V_x|$ needs an upper limit, which must satisfy $|V_t \setminus V_x| \geq n$ for all $t$.
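The reselection behaviour described for Curation(n, C, R, G) can be illustrated with a minimal Python sketch. It is not the authors' Algorithm 2: the `respond` callback (returning `None` for a curator who misses the deadline) is a hypothetical stand-in for report collection, the graph argument is dropped, and curators are drawn uniformly at random here, whereas CitedTCR weights the draw by PPR.

```python
import random


def curation(n, candidates, reports, respond, rng):
    """Collect up to n reports, replacing curators who do not respond.

    candidates: nodes eligible as curators (C in the text)
    reports:    reports collected so far, as {node: 0 or 1} (R in the text)
    respond:    hypothetical callback; returns 0/1, or None on a timeout
    rng:        random.Random instance (uniform draw is an assumption)
    """
    reports = dict(reports)  # leave the caller's dictionary untouched
    pool = [c for c in candidates if c not in reports]
    while len(reports) < n and pool:
        # Draw exactly as many curators as reports are still missing.
        chosen = rng.sample(pool, min(n - len(reports), len(pool)))
        for c in chosen:
            pool.remove(c)  # an unresponsive curator is not asked again
            answer = respond(c)
            if answer is not None:
                reports[c] = answer
    return reports


if __name__ == "__main__":
    # Odd-numbered nodes never answer; even-numbered nodes accept.
    print(curation(3, list(range(10)), {}, lambda c: None if c % 2 else 1,
                   random.Random(0)))
```

If the pool is exhausted before n reports arrive, the sketch returns the reports it has; footnote 17's condition on $|V_t \setminus V_x|$ is what keeps the pool large enough in the model.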

PPR for curator assignment
PPR is an algorithm that recommends relatively important nodes to a given node through iterative random walking on a network (as a Markov chain). CitedTCR uses PPR for a curator assignment that selects $C_t$ as important nodes for $x$. In the example presented in Fig. 1 (b), the set of curators $C_t = \{D, F, G\}$ is selected from $\{A, B, C, \cdots\} \setminus \{A, B\}$ for the assessment of the proposed graph $\tilde{G}_t(\{x, A, B\}, \{(x, A), (x, B)\})$, where $C_t$ is regarded as important nodes from the standpoint of $x$ with the reference $V_x = \{A, B\}$. Nodes such as $V_x$ are often referred to as base nodes in the PPR context, and are the key for computing relative importance.
To quantify the process of random walking, PPR leverages a transition matrix $P$, which in our case is $|V_t| \times |V_t|$, and an element $p_{ij}$ designates the probability of transition from node $i$ to node $j$. In the random walk as a Markov chain, the value of $p_{ij}$ becomes the reciprocal of node $i$'s out-degree.
The simplified PageRank 18 score of $V_t$ is the dominant eigenvector (for eigenvalue 1) of $P$, which indicates the steady-state probability distribution as a result of iterative random walking. Moreover, the PPR score of $V_t$ is the dominant eigenvector of $P_{PPR}$, which has the following modification to $P$ [15]:

$$P_{PPR} = (1 - \alpha) P + \alpha \frac{B}{|V_x|},$$

where $B$ is an additional $|V_t| \times |V_t|$ matrix whose element $b_{ij}$ becomes 1 if $j$ is included in base nodes $V_x$; otherwise, it becomes 0 (i.e., $b_{iA}$ and $b_{iB}$ become 1 for all $i$ and other elements become 0 in Fig. 1 (b)). We can interpret $B/|V_x|$ as another transition matrix in which all nodes in $V_t$ must jump to one node selected from $V_x$ uniformly at random. Thus, $P_{PPR}$ is the linear combination of the two transition matrices $P$ and $B/|V_x|$ that represents biased random walking, which jumps to one of the base nodes with probability $\alpha$ in each step. Here, $\alpha \in [0, 1]$ is called a damping factor, and it can adjust the strength of bias as an exogenous parameter ($\alpha = 0.15$ in most cases).
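As a rough illustration of the biased random walk, the following Python sketch approximates the steady-state distribution of $(1 - \alpha)P + \alpha B/|V_x|$ by power iteration. The function name `ppr_scores`, the adjacency-dictionary input, and the fixed iteration count are assumptions made for this example; it also assumes every node has at least one outgoing link, so that each row of $P$ is a probability distribution.

```python
def ppr_scores(out_links, base_nodes, alpha=0.15, iterations=100):
    """Approximate the PPR score by power iteration.

    out_links:  {node: list of neighbours reachable in one step}
                (assumed non-empty for every node)
    base_nodes: the set V_x to which the walk jumps with probability alpha
    """
    nodes = list(out_links)
    score = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iterations):
        # Jump part: alpha * B / |V_x| sends mass alpha uniformly to V_x.
        nxt = {v: (alpha / len(base_nodes) if v in base_nodes else 0.0)
               for v in nodes}
        # Walk part: (1 - alpha) * P, with p_ij = 1 / out-degree of i.
        for i in nodes:
            share = (1.0 - alpha) * score[i] / len(out_links[i])
            for j in out_links[i]:
                nxt[j] += share
        score = nxt
    return score


if __name__ == "__main__":
    # A four-node path A - B - C - D, walked in both directions.
    path = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    print(ppr_scores(path, {"A"}))
```

Each iteration preserves the total probability mass, so the returned scores sum to one; nodes close to the base nodes receive more of it as $\alpha$ grows.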
CitedTCR stochastically selects $n$ curators in each period according to the PPR computed from $P_{PPR}$. Below, we discuss three properties in this application of PPR. First, similar to PaperRank [13], CitedTCR considers $G_t$ to be undirected when using PPR. This is important because if PPR were computed on a DAG structure, its score would concentrate on the nodes with no out-edges (i.e., the oldest content in the case of a citation graph) and would thus be unreliable for recommender systems. Second, as already mentioned, CitedTCR excludes $V_x$ from the candidates of $C_t$. Although PPR scores are high for the base nodes (reference nodes for $x$), we do not select them to avoid biased curation, in which assigned curators accept $x$ simply to increase their number of citations 19 . Third, CitedTCR can encourage users to register high-quality content in $G_t$, even though the frequency with which they become curators is weighted by PPR. This is experimentally confirmed in Section 4 using the PageRank score in $G_t$ as a proxy for quality.
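The first two properties can be sketched in Python as follows. Both helper names are hypothetical, and drawing without replacement with probability proportional to the score is an assumption of this example; the text only states that selection is stochastic and follows PPR.

```python
import random


def undirected_links(edges):
    """Turn citation edges (citing, cited) into symmetric neighbour sets."""
    links = {}
    for citing, cited in edges:
        links.setdefault(citing, set()).add(cited)
        links.setdefault(cited, set()).add(citing)
    return links


def assign_curators(scores, base_nodes, n, rng):
    """Draw up to n distinct curators, weighted by score, excluding V_x."""
    pool = [v for v in scores if v not in base_nodes and scores[v] > 0]
    weights = [scores[v] for v in pool]
    curators = []
    while pool and len(curators) < n:
        i = rng.choices(range(len(pool)), weights=weights)[0]
        curators.append(pool.pop(i))  # remove so nobody is drawn twice
        weights.pop(i)
    return curators


if __name__ == "__main__":
    toy_scores = {"A": 0.35, "B": 0.30, "D": 0.15, "F": 0.12, "G": 0.08}
    print(assign_curators(toy_scores, {"A", "B"}, 2, random.Random(0)))
```

In a fuller pipeline, the scores would come from a PPR computation on `undirected_links(E_t)` with the references of $x$ as base nodes; the toy scores above are made up for the demonstration.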

DG13 for reward computation
DG13 and other peer-prediction mechanisms aim to elicit truthful information from an environment in which users report the quality of a task. For example, in CitedTCR, $n$ assigned curators $C_t = \{1, 2, \cdots, n\}$ provide reports $R_t = \{r_1^{\tilde{G}_t}, r_2^{\tilde{G}_t}, \cdots, r_n^{\tilde{G}_t}\}$ on the quality of the proposed graph $\tilde{G}_t$. To confirm whether a report is truthful, peer prediction assumes a stochastic signal $s$, which any $c \in C_t$ can observe from $\tilde{G}_t$ and can use as input information for $r_c^{\tilde{G}_t}$. DG13 focuses on binary signals $s \in \{0, 1\}$ and binary reports $r(s) \in \{0, 1\}$ (0: reject; 1: accept). We use the notation $s_c^{\tilde{G}_t}$ in the same manner as in reporting, i.e., curator $c$ accepts adding $\tilde{G}_t$ to $G_t$ if $r_c^{\tilde{G}_t}(s_c^{\tilde{G}_t}) = 1$ and rejects it if $r_c^{\tilde{G}_t}(s_c^{\tilde{G}_t}) = 0$. This report is truthful in the $r_c^{\tilde{G}_t}(0) = 0$ or $r_c^{\tilde{G}_t}(1) = 1$ case and non-truthful in the $r_c^{\tilde{G}_t}(0) = 1$ or $r_c^{\tilde{G}_t}(1) = 0$ case. Note that $r_c^{\tilde{G}_t}$ and $s_c^{\tilde{G}_t}$ are sometimes denoted $r_c$ and $s_c$ when their task does not need to be emphasized.
We add two more assumptions that are common in the literature on peer prediction for binary signals [6,17,36]. First, $s$, observed by each curator from each task, is positively correlated. Accordingly, when we randomly select another curator $\hat{c} \in C_t$, both $\Pr(s_c = 0 \mid s_{\hat{c}} = 0) > \Pr(s_c = 0)$ and $\Pr(s_c = 1 \mid s_{\hat{c}} = 1) > \Pr(s_c = 1)$ hold for all $c$ and $\hat{c}$, regardless of the tasks 20 . This requires the propensity of the assigned $\tilde{G}_t$ and the peer curators of $c$ to be somewhat homogeneous 21 throughout each period. CitedTCR with a citation graph ensures such an environment by curator assignment based on PPR; this is unlike recent multi-task peer prediction [1,21], which becomes complex in order to relax this assumption. The second assumption is that each curator must select one reporting strategy from feasible choices. The set of feasible strategies in our model, presented in Fig. 2, is the union of mapping strategies and uninformative signal-independent strategies. Mapping strategies follow a mapping rule from signals to reports; however, the reports in uninformative strategies follow a given stochastic distribution independent of the observed signals. For the four possible mapping strategies under the assumption of binary signals, we specifically define a strategy that always reports truth as a truthful strategy, and a strategy that always reports non-truth as an opposite strategy.

18 Although the original paper [27] uses Simplified PageRank as an introduction to the model description, PageRank is the dominant eigenvector of the matrix $P_{PR} = (1 - \alpha)P + \alpha (1/|V_t|)\mathbf{1}$, where $\mathbf{1}$ is the $|V_t| \times |V_t|$ matrix whose elements are all 1. Namely, $P_{PR}$ quantifies the random walking which, with probability $\alpha$, jumps to one of all existing nodes uniformly at random (random-surfer model). This is to make PageRank work even in a directed network that includes a dead-end loop or a node with no out-edges.
19 Note that even this modification cannot completely eliminate biased curation, as long as the curation affects the future structure of $G_t$. Analyzing the strength of this bias is one of our future tasks.
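The feasible strategies can be written down as small Python functions, which may help when reading the reward analysis. This is only an illustrative encoding: the function names and the `(signal, rng)` calling convention are assumptions of this sketch, and the uninformative strategy is parameterized here by its probability of reporting accept.

```python
import random


def truthful(signal, rng):
    """Mapping strategy that always reports the observed signal."""
    return signal


def opposite(signal, rng):
    """Mapping strategy that always reports the flipped signal."""
    return 1 - signal


def uninformative(p_accept):
    """Signal-independent strategy: report 1 with probability p_accept."""
    def strategy(signal, rng):
        return 1 if rng.random() < p_accept else 0
    return strategy


if __name__ == "__main__":
    rng = random.Random(0)
    coin = uninformative(0.5)  # the 50-50 uninformative strategy
    print([truthful(s, rng) for s in (0, 1)],
          [opposite(s, rng) for s in (0, 1)],
          [coin(s, rng) for s in (0, 1)])
```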
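Finally, if we let $R_c \subset \mathcal{R}_t$ be the set of all (intertemporal) reports that $c$ has provided for multiple proposed graphs $\tilde{G}_t$, where $\mathcal{R}_t$ denotes the stock of all reports elicited up to period $t$, and let $R_c^*$ be a special case in which all elements are truthful reports (i.e., $c$ adopts a truthful strategy), the achievement of DG13 can be defined as follows:

Strong truthfulness: $E[\theta_c^{\tilde{G}_t} \mid R_c^*, R_{\hat{c}}^*] \geq E[\theta_c^{\tilde{G}_t} \mid R_c, R_{\hat{c}}]$ holds for all $c$, $\hat{c}$, $R_c$, $R_{\hat{c}}$, and $\tilde{G}_t$, where equality occurs only when both $c$ and $\hat{c}$ adopt the opposite strategy 22 .

In other words, compared to any other strategy, the mechanism satisfying strong truthfulness can assign strictly higher expected rewards $E[\theta_c^{\tilde{G}_t}]$ to the equilibrium by truthful strategies for almost all cases.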
DG13, as a multi-task peer prediction mechanism, computes $c$'s reward $\theta_c^{\tilde{G}_t}$ for the proposed graph $\tilde{G}_t$ using not only the reports that $c$ and a randomly selected $\hat{c}$ produced in period $t$ (i.e., $r_c^{\tilde{G}_t}$, $r_{\hat{c}}^{\tilde{G}_t}$) but also all reports that $c$ and $\hat{c}$ produced until period $t$ (i.e., $R_c$, $R_{\hat{c}}$). According to the original report [6] and a subsequent report for its generalization [32], DG13 can be formulated as

$$\theta_c^{\tilde{G}_t} = \delta_{r_c^{\tilde{G}_t},\, r_{\hat{c}}^{\tilde{G}_t}} - \delta_{r_c \in \{R_c \setminus R_t\},\, r_{\hat{c}} \in \{R_{\hat{c}} \setminus R_t\}},$$

where $R_t$ is the set of reports produced in period $t$ and we use the following Kronecker's delta for the sake of convenience:

$$\delta_{a,b} = \begin{cases} 1 & \text{if } a = b, \\ 0 & \text{otherwise.} \end{cases}$$

Here, $\delta_{r_c^{\tilde{G}_t},\, r_{\hat{c}}^{\tilde{G}_t}}$ is the reward for curation in period $t$. It is apparent that a value of 1 is obtained when two reports for $\tilde{G}_t$ return the same signal, $(r_c^{\tilde{G}_t}, r_{\hat{c}}^{\tilde{G}_t}) = (0, 0)$ or $(1, 1)$; otherwise, the value is 0. $\delta_{r_c \in \{R_c \setminus R_t\},\, r_{\hat{c}} \in \{R_{\hat{c}} \setminus R_t\}}$ is a type of penalty that randomly selects two reports $r_c$ and $r_{\hat{c}}$ produced by each curator before period $t$ and compares them in the same manner. Assuming that $c$ and $\hat{c}$ always report 1 for assigned tasks irrespective of the signals, $\theta_c^{\tilde{G}_t} = 0$ holds because the penalty term becomes 1 even though $r_c^{\tilde{G}_t}$ and $r_{\hat{c}}^{\tilde{G}_t}$ always yield a reward of 1. A similar result would be derived for the case of a 50-50 uninformative strategy (i.e., $\Pr(r = 0) = \Pr(r = 1) = 0.5$) because the expected values of the reward term and the penalty term both become 0.5. Although $\theta_c^{\tilde{G}_t}$ takes values in the interval $[-1, 1]$ because of the penalty, all rewards can be made non-negative by adding 1 to all $\theta_c^{\tilde{G}_t}$ as a basic reward. Dasgupta and Ghosh [6] indicated that the expected (net) reward $E[\theta_c^{\tilde{G}_t}]$ is maximized in the equilibrium in which all curators adopt a truthful strategy by exerting effort on signal observation under the assumption of positively correlated signals. Note that DG13 in CitedTCR must collectively compute rewards for previous reports after $c$ and $\hat{c}$ both finish reporting three times. Three is the number that satisfies the minimum requirements for establishing multi-task peer prediction without loss of generality [32]: (i) two users, (ii) three total tasks, and (iii) two or more tasks per user, including at least one common task. Although each node curates many proposed graphs during $\{G_t\}_{t=0}^{\infty}$ (as long as it has high quality), CitedTCR with iterative reward computation cannot satisfy (iii) when either $c$ or $\hat{c}$ produces a report for the first time.
Thus, we postpone reward computation until both $c$ and $\hat{c}$ are sure to meet all minimum requirements by three reports 23 , so that DG13 can elicit truthful reports from curators.

20 Accordingly, $\Pr(s_c = 1 \mid s_{\hat{c}} = 0) < \Pr(s_c = 1)$ and $\Pr(s_c = 0 \mid s_{\hat{c}} = 1) < \Pr(s_c = 0)$ hold simultaneously.
21 The homogeneity required for positively correlated signals is not as strong in binary signals as in multiple signals.
22 The original definition [32] generalizes both the truthful strategy and the opposite strategy as a permutation strategy to encompass the case of multiple (non-binary) signals.
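The reward rule described above, an agreement term on the shared task minus an agreement term on two randomly drawn earlier reports, can be sketched in Python as follows. The data layout (`{task: report}` dictionaries per curator) and the `ValueError` for curators without earlier reports are assumptions of this sketch; the latter mirrors the postponement of reward computation rather than any behaviour specified in the text.

```python
import random


def delta(a, b):
    """Kronecker's delta: 1 if the two reports agree, 0 otherwise."""
    return 1 if a == b else 0


def dg13_reward(task, reports_c, reports_peer, rng):
    """Reward of curator c on `task`, given a peer's reports.

    reports_c, reports_peer: {task: 0 or 1}; both must contain `task`
    and at least one other task each.
    """
    earlier_c = [r for t, r in reports_c.items() if t != task]
    earlier_peer = [r for t, r in reports_peer.items() if t != task]
    if not earlier_c or not earlier_peer:
        raise ValueError("both curators need a report on another task")
    agreement = delta(reports_c[task], reports_peer[task])
    penalty = delta(rng.choice(earlier_c), rng.choice(earlier_peer))
    return agreement - penalty


if __name__ == "__main__":
    rng = random.Random(0)
    always_accept = {"g1": 1, "g2": 1, "g3": 1}
    # Two curators who always accept agree everywhere, so reward and
    # penalty cancel out.
    print(dg13_reward("g3", always_accept, dict(always_accept), rng))
```

Adding 1 to the returned value gives the non-negative variant mentioned in the text.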

EXPERIMENTAL STUDIES
Although Section 3 describes the utility of PPR and DG13, our study must assess how their combination contributes to the construction of the reliable list $G_t$. In this section, we perform this assessment experimentally using two step-wise simulations that are both based on the DAG-structured dataset formatted from the arXiv high-energy physics theory (HEP-TH) citation network. In particular, the simulation first uses only PPR to examine the strength of the incentive for registering high-quality content. It then incorporates DG13 to confirm the incentive for eliciting informative reports. All materials used for this experiment are available in the GitHub repository 24 .

Fig. 3. Our experiments use a DAG structure with 1,421 time-ordered nodes, where green represents the citation relationships of the first 421 nodes, while red represents the citation relationships of the last 1,000 nodes. We consider the state transition $\{G_t\}_{t=0}^{1000}$ by letting the green subgraph be $G_0$.

Dataset
The arXiv HEP-TH citation network is a dataset provided by the Stanford Network Analysis Project 25 (SNAP), which contains the citation relationships of academic papers in the HEP-TH category submitted from January 1993 to April 2003. We selected one component with 1,421 papers since January 2000, and constructed a DAG structure as depicted in Fig. 3 (powered by Cytoscape [31]). Here, the green component represents the citation relationships of the first 421 nodes, while the red component represents the citation relationships of the last 1,000 nodes (i.e., the green part is a subgraph of the DAG structure). Our experiments consider the green component the initial state $G_0$ and consider the state transition $\{G_t\}_{t=0}^{1000}$ by sequentially adding the nodes and edges in the red component to $G_t$.

Incentive for registering high-quality content
Thus far, we have assumed that CitedTCR tends to select curators more frequently from nodes that are regarded as important in $G_t$, which serves as an incentive for users to register high-quality content. However, this assumption is not obvious because the curator assignment in each period is weighted by the PPR algorithm, which excludes even base nodes from the candidate list. To determine the true strength of the incentive for registering high-quality content, our first experiment computes the correlation between the frequency distribution with which each of the 1,421 nodes is selected as a curator through sequential assignments up to $G_{1000}$, and the (not simplified) PageRank score for the 1,421 nodes in $G_{1000}$ 26 . Here, the former designates the number of opportunities in which each node can earn rewards as a curator for the state transition $\{G_t\}_{t=0}^{1000}$, while the latter designates the importance of each node from the viewpoint of the entire DAG in $G_{1000}$. We specifically computed Spearman's rank correlation coefficient 27 of these values 10 times 28 for each of 20 cases with a different number of assigned curators: $n = \{1, 2, \cdots, 20\}$. Fig. 4 summarizes the trend of the 200 derived correlation coefficients in a box plot that depicts the median value as orange lines, the 25/75 percentiles as boxes, the pseudo-maximum/minimum values as bars, and outliers as circles. This figure reveals that all correlation coefficients are within the range of 0.4 to 0.7, which can be regarded as moderately correlated. Moreover, they begin to converge between 0.65 and 0.7 when $n$ exceeds 10. These results indicate that CitedTCR can retain sufficient incentive to register high-quality content, especially when it assigns more than 10 curators to the proposed graph in each period, even though curator assignment relies on the PPR algorithm without base nodes.

Fig. 4. The first experiment computes Spearman's rank correlation coefficients between the frequency distribution of curator assignment up to $G_{1000}$ and the PageRank score of the DAG in $G_{1000}$. The box plot for all 200 coefficients (10 times for each $n = \{1, 2, \cdots, 20\}$) represents the moderate positive correlation, which increases as $n$ increases and converges between 0.65 and 0.7. This result supports our assumption that CitedTCR tends to select curators more frequently from nodes that are considered important in $G_t$.
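For readers who want to reproduce the statistic, Spearman's rank correlation can be computed as the Pearson correlation of the ranks. The following dependency-free Python sketch assigns average ranks to ties; the function names and the toy data are illustrative and are not taken from the authors' repository.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5  # undefined for constant input


if __name__ == "__main__":
    # Toy data: how often four nodes were curators vs. their PageRank.
    frequency = [12, 3, 7, 0]
    pagerank = [0.40, 0.15, 0.35, 0.10]
    print(spearman(frequency, pagerank))
```

Because only ranks enter the computation, the coefficient is insensitive to the heavy-tailed shape of both distributions, which is the reason given in footnote 27 for preferring it to the Pearson coefficient.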

Incentive for eliciting informative reports
After the simulation of curator assignment, the second experiment adds the DG13 mechanism to the first experiment to compute the expected reward $E(\theta^{G_t}_c)$ stemming from $r^{G_t}_c$ and $r^{G_t}_{\hat{c}}$. To simulate the settings of DG13, in which the user reports the received signal $s \in \{0, 1\}$ according to a given strategy, we stochastically allocate the strategy and $s \in \{0, 1\}$ in advance to all 1,421 nodes. In this experiment, the nodes are assumed to use either the truthful strategy or the aforementioned 50-50 uninformative strategy. The allocation of the two strategies is subject to the exogenous randomness parameter $\epsilon \in \{0.0, 0.1, \dots, 1.0\}$, where the expected number of nodes with the uninformative strategy is $\epsilon \cdot 1{,}421$, and the expected number of nodes with the truthful strategy is $(1 - \epsilon) \cdot 1{,}421$. Similarly, $s \in \{0, 1\}$ is allocated to the 1,421 nodes by another exogenous parameter $\Pr(s = 0) \in \{0.0, 0.1, \dots, 1.0\}$. We computed $E(\theta^{G_t}_c)$ by averaging the total reward generated in $\{G_t\}_{t=0}^{1000}$ for each of the 121 environments comprising different allocations of these two exogenous parameters, $\{0.0, 0.1, \dots, 1.0\} \times \{0.0, 0.1, \dots, 1.0\}$, in which $n = 10$ and $m = 0$ are fixed in every environment (i.e., the proposed $G_t$ is always accepted regardless of the reports).

26 We set $\alpha = 0.15$ in both the PageRank and the PPR algorithms.
27 We cannot use the Pearson correlation coefficient because both the frequency distribution and the PageRank scores follow a power-law distribution rather than a normal distribution.
28 The correlation coefficients differ across the 10 computations because curators are assigned stochastically according to the PPR algorithm, whereas the PageRank score is constant.
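The role of the randomness parameter can be illustrated with a small Monte Carlo sketch in Python. It is a toy model, not the experiment described above. It scores one focal curator against a single peer rather than simulating the citation graph, and it assumes a signal model that this section does not specify: both curators observe a noisy copy of one latent bit per task, with a hypothetical `accuracy` parameter. All function and parameter names are likewise hypothetical.

```python
import random


def draw_signal_pair(p_zero, accuracy, rng):
    """Assumed model: two noisy observations of one latent bit for a task."""
    latent = 0 if rng.random() < p_zero else 1
    return [latent if rng.random() < accuracy else 1 - latent for _ in range(2)]


def report(signal, truthful, rng):
    """Truthful nodes report the signal; uninformative nodes flip a fair coin."""
    return signal if truthful else rng.randint(0, 1)


def mean_score(c_truthful, epsilon, p_zero, accuracy=0.9, trials=20000, seed=0):
    """Average of (reward term - penalty term) for a focal curator c."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # The peer is uninformative with probability epsilon.
        peer_truthful = rng.random() >= epsilon
        # Reward term: agreement on a shared task (correlated signals).
        s_c, s_peer = draw_signal_pair(p_zero, accuracy, rng)
        reward = report(s_c, c_truthful, rng) == report(s_peer, peer_truthful, rng)
        # Penalty term: agreement on two unrelated tasks (independent signals).
        s_c_other = draw_signal_pair(p_zero, accuracy, rng)[0]
        s_peer_other = draw_signal_pair(p_zero, accuracy, rng)[1]
        penalty = (report(s_c_other, c_truthful, rng)
                   == report(s_peer_other, peer_truthful, rng))
        total += int(reward) - int(penalty)
    return total / trials
```

Under these assumptions, with `p_zero=0.5` and `accuracy=0.9`, truthful curators agree with probability 0.82 on a shared task and 0.5 on unrelated tasks, so a truthful focal curator expects a score of $0.32(1 - \epsilon)$, whereas an uninformative one expects zero for any $\epsilon$.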

CONCLUSION
In this study, we proposed CitedTCR, which incorporates the expertise of anonymous curators into existing TCRs by constructing a reliable citation graph, which is a common proxy for measuring the quality of technical content (e.g., academic papers, patents). To achieve this enhancement on a public peer-to-peer network, we leveraged the PPR algorithm and DG13 mechanism, where the former assigns appropriate curators and the latter elicits informative reports from the assigned curators. As a hybrid of network-based and token-based recommender systems, the combination of previous methods can lead to an incentive design that provides more reward tokens to users as they register high-quality content and continue producing informative reports. Although this incentive design has a different approach than existing TCRs that involve token staking, CitedTCR has sufficient utility, which was confirmed theoretically and experimentally. This study can contribute to the emerging discussion on TCRs through its use of a citation graph and peer-prediction mechanism.
However, for practical implementation of this proposal, two remaining issues must be addressed in future work. One involves relaxing the strong assumption of a one-to-one correspondence between users and nodes. Despite the importance of being spam- and sybil-proof for the robustness of peer-to-peer systems, CitedTCR without one-to-one correspondence is vulnerable to such attacks because the roles of the applicant and its curators can easily overlap if users can create many sybil accounts or post many pieces of content to $G_t$. To overcome these attacks, an environment may be required in which curators are selected not from $V_t$ but from $U_t$, and users in $U_t$ have no incentive to create sybil accounts when posting multiple pieces of content. Indices and algorithms that address similar issues have been proposed in fields related to CitedTCR, such as SocialRank [34] in network-based recommender systems, the h-index [16] in citation analysis, and Proof of Stake [14,30] in blockchain. Assessing the applicability of such existing studies to CitedTCR is therefore a topic for future research.
The second remaining task is to design a valuable reward token. Although this study assumes that users act to maximize the amount of reward tokens, the power of tokens as an incentive is subject to their value, which is determined based on their utility, scarcity, and sustainability. CitedTCR therefore requires additional mechanisms to ensure the value of reward tokens as in the Bitcoin protocol, where block-reward halving fixes total supply, and difficulty adjustment stabilizes hash rate. A potential approach is to charge every applicant a token-based registration fee whose price is elastic and based on the frequency with which $G_t$ is proposed in a given period 30 . This approach is worth considering because a registration fee gives the reward token a utility and can serve to prevent spam attacks.

A EXPECTED REWARDS IN A SIMPLE TOKEN-STAKING SCHEME
Consider a simple token-staking example in which $n$ curators each stake a fixed number $q$ of tokens on one of the options. Let $k$ be the amount of (net) rewards that curators can obtain when their selections become the consensus, and let $p$ be the curators' subjective probability of the realization of this event. Then, the expected reward in this example is $E(k) = pk - (1 - p)q$.
Specifically, $k$ is the redistribution of the total staked tokens $nq$ among the curators who have staked on the consensus, net of one's own stake $q$. Accordingly, if we let $n^*$ be the number of curators who have staked on the consensus, $k = \frac{n}{n^*} q - q = \frac{n - n^*}{n^*} q$. By substituting this into the equation of $E(k)$, we can derive the following condition:

$$E(k) \geq 0 \iff \frac{p}{1 - p} \geq \frac{n^*}{n - n^*},$$

where $p/(1 - p)$ and $n^*/(n - n^*)$ represent the expected and actual odds, respectively, of one's choice becoming the consensus; i.e., the expected reward in the model takes a positive value only when we estimate the odds to be higher than their actual value and is zero as long as our estimation is precise (as a result of the zero-sum game). Furthermore, the expected reward under precise odds estimation is negative if we take the cost of curation into account 31 .
These results reveal that the token-staking scheme does not have sufficient incentive to engage curators in consensus building. Providing new reward tokens to curators in proportion to the score of the peer-prediction mechanism is one possible approach to this problem.
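The break-even argument above can be checked numerically. The following Python sketch evaluates $E(k) = p(k - c) - (1 - p)(q + c)$ with exact fractions, using $c = 0$ for the basic case. The function names and the sample values ($n = 10$, $n^* = 6$, $q = 5$) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def net_reward(n, n_star, q):
    """k = (n - n*) / n* * q: the losers' stakes shared among the n* winners."""
    return Fraction(n - n_star, n_star) * q


def expected_reward(p, n, n_star, q, cost=0):
    """E(k) = p(k - c) - (1 - p)(q + c); reduces to pk - (1 - p)q when c = 0."""
    k = net_reward(n, n_star, q)
    return p * (k - cost) - (1 - p) * (q + cost)


# Hypothetical example: 10 curators, 6 of whom stake on the consensus.
n, n_star, q = 10, 6, 5
precise_p = Fraction(n_star, n)  # odds p/(1-p) equal n*/(n - n*)

break_even = expected_reward(precise_p, n, n_star, q)            # precise estimate
optimistic = expected_reward(Fraction(7, 10), n, n_star, q)      # overestimated odds
with_cost = expected_reward(precise_p, n, n_star, q, cost=1)     # curation cost c = 1
```

With a precise estimate the expected reward is exactly zero, an overestimate makes it positive, and any positive curation cost lowers it by exactly that cost, which is consistent with the zero-sum reading of the scheme.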

B PROOF OF THE STRONG TRUTHFULNESS OF THE DG13 MECHANISM
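This proof uses notations that are compatible with Section 3.3. The expected value of the reward term $\delta_{r^{G_t}_c, r^{G_t}_{\hat{c}}}$ depends not only on the results of $r^{G_t}_c$ and $r^{G_t}_{\hat{c}}$, but also on the probability distribution of input signals that each node observes in period $t$, as follows:

$$E\left(\delta_{r^{G_t}_c, r^{G_t}_{\hat{c}}}\right) = \sum_{s_c \in \{0,1\}} \sum_{s_{\hat{c}} \in \{0,1\}} \Pr(s_c, s_{\hat{c}}) \cdot \delta_{r_c(s_c), r_{\hat{c}}(s_{\hat{c}})},$$

where $\Pr(s_c, s_{\hat{c}})$ is the joint probability distribution of the signals that $c$ and $\hat{c}$ can receive from $G_t$. Note that the right-hand side does not require superscript $G_t$ because of the assumption of positively correlated signals.
As described in Section 3.3, the penalty term is the result of the comparison between two randomly picked reports that $c$ and $\hat{c}$ produce prior to period $t$. We can write the expected value of the penalty in a similar form to the reward term as follows:

$$\sum_{s_c \in \{0,1\}} \sum_{s_{\hat{c}} \in \{0,1\}} \Pr(s_c) \Pr(s_{\hat{c}}) \cdot \delta_{r_c(s_c), r_{\hat{c}}(s_{\hat{c}})}.$$

This uses the product distribution $\Pr(s_c)\Pr(s_{\hat{c}})$ rather than the joint distribution $\Pr(s_c, s_{\hat{c}})$ because the penalty term covers all intertemporal reports included in $R_c \setminus R_t$ and $R_{\hat{c}} \setminus R_t$. Subtracting the expected penalty from the expected reward gives

$$E(\theta^{G_t}_c) = \sum_{s_c = s_{\hat{c}}} \left[\Pr(s_c, s_{\hat{c}}) - \Pr(s_c)\Pr(s_{\hat{c}})\right]_{>0} \cdot \delta_{r_c(s_c), r_{\hat{c}}(s_{\hat{c}})} + \sum_{s_c \neq s_{\hat{c}}} \left[\Pr(s_c, s_{\hat{c}}) - \Pr(s_c)\Pr(s_{\hat{c}})\right]_{<0} \cdot \delta_{r_c(s_c), r_{\hat{c}}(s_{\hat{c}})},$$

where $[x]_{>0}$ and $[x]_{<0}$ indicate that $x$ is positive and negative, respectively 32 . It is apparent that $E(\theta^{G_t}_c)$ is maximized only when both $c$ and $\hat{c}$ provide truthful reports ($r(0) = 0$, $r(1) = 1$) or opposite reports ($r(0) = 1$, $r(1) = 0$). Any other pattern, such as nodes using asymmetric strategies or always reporting the same signal, yields a lower expected value. Under the assumption of using one reporting strategy, this outcome indicates that $E(\theta^{G_t}_c)$ is maximized only when both $c$ and $\hat{c}$ adopt either a truthful or an opposite strategy. Thus, DG13 satisfies strong truthfulness. □

31 If we assume the cost of curation to be $c$, the expected reward in this example becomes $E(k) = p(k - c) - (1 - p)(q + c)$. This extension shifts the condition for $E(k) = 0$ from $\frac{p}{1 - p} = \frac{n^*}{n - n^*}$ to $\frac{p}{1 - p} = \frac{n^*(q + c)}{(n - n^*)q - n^* c}$.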
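The maximization claim in this proof can be verified by brute force for any concrete signal distribution. The sketch below, in Python, enumerates all deterministic reporting strategies for $c$ and $\hat{c}$ and evaluates the expected score as the sum of $(\Pr(s_c, s_{\hat{c}}) - \Pr(s_c)\Pr(s_{\hat{c}})) \cdot \delta$ over signal pairs. The joint distribution is a hypothetical example of positively correlated signals, and the function and variable names are illustrative.

```python
from itertools import product


def expected_score(joint, strat_c, strat_peer):
    """Sum over signals of (Pr(s_c, s_peer) - Pr(s_c) Pr(s_peer)) * delta."""
    marg_c = [sum(joint[s][t] for t in (0, 1)) for s in (0, 1)]
    marg_peer = [sum(joint[s][t] for s in (0, 1)) for t in (0, 1)]
    total = 0.0
    for s, t in product((0, 1), repeat=2):
        # delta is 1 when the two reports agree and 0 otherwise.
        agree = 1 if strat_c[s] == strat_peer[t] else 0
        total += (joint[s][t] - marg_c[s] * marg_peer[t]) * agree
    return total


# Hypothetical positively correlated joint distribution Pr(s_c, s_peer).
joint = [[0.4, 0.1],
         [0.1, 0.4]]

# A deterministic strategy is the pair (r(0), r(1)).
strategies = list(product((0, 1), repeat=2))
scores = {(a, b): expected_score(joint, a, b)
          for a, b in product(strategies, repeat=2)}
best = max(scores.values())
maximizers = sorted(pair for pair, v in scores.items() if abs(v - best) < 1e-12)
```

For this distribution, the only maximizing pairs are the one in which both nodes are truthful, `((0, 1), (0, 1))`, and the one in which both report the opposite signal, `((1, 0), (1, 0))`; constant strategies score zero and mixed truthful-opposite pairs score negatively.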