Author: Ashok Nag

  • Block Chain – for my own understanding:

    I explored the definitional attributes of blockchain in Part 1 of this series. In this Part 2, I want to delve into the database properties of the blockchain.

    Part2: Blockchain as a database

    Storing a variety of objects in an organized manner is an art as well as a science. For example, books in a library are commonly organized using a hierarchical scheme known as the Dewey Decimal Classification, which Melvil Dewey devised in 1873 and first published in 1876.

    When data about a class of objects is organized in a database, the science lies in data semantics and in retrieving data consistently, subject to any constraints that exist among the objects as well as among the attributes of an object. The data model encapsulates all these issues of data organization from the perspective of the users of data. Historically, data models have evolved along with the exponential growth in the computing power and storage capacity of computer hardware.

    Currently, the relational data model is the predominant data model used by enterprises. E.F. Codd, while working as a computer scientist at IBM, first proposed the relational data model in his 1970 paper titled “A Relational Model of Data for Large Shared Data Banks”. In his 1981 Turing Award lecture, he pointed out three main objectives that a relational database system tries to achieve. These are:

    1. Data independence objective: the database draws a sharp boundary between the logical view of data (i.e. the business view) and the physical view of data (i.e. machine storage, or the technical view).
    2. Communicability objective: a simple, intuitive way of organizing data so that business users can have a common understanding of the data.
    3. Set-processing objective: application of the principles of set theory in processing two or more datasets together, e.g. the union of two sets, a subset of a set, the complement of a set, etc.
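    The set-processing objective can be illustrated with a small sketch. The account identifiers and dataset names here are hypothetical, and Python's built-in sets stand in for relational tables; a relational database would express the same operations in SQL.

```python
# Hypothetical account identifiers held in two separate datasets.
retail_customers = {"A101", "A102", "A103", "A104"}
loan_customers = {"A103", "A104", "A205"}

# Union: every customer that appears in either dataset.
all_customers = retail_customers | loan_customers

# Intersection: customers that hold both products.
both_products = retail_customers & loan_customers

# Complement of loan customers relative to retail customers.
retail_only = retail_customers - loan_customers

# Sub-setting by a condition on the identifier itself.
later_ids = {c for c in all_customers if c > "A103"}

print(sorted(all_customers))
print(sorted(both_products))
print(sorted(retail_only))
print(sorted(later_ids))
```

    The point of the sketch is that each result is itself a set, so the operations can be chained freely. This is the property that a list of blocks does not offer out of the box.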

    Let us examine the extent to which a blockchain meets the above objectives. A caveat is in order here. Blockchain was not designed to work as a distributed database but only as a distributed ledger. A ledger does not qualify as an enterprise-level database, as we have argued in Part 1. So the lack of some features of a standard distributed database does not negate the usefulness of blockchain in many areas. However, my firm view is that cryptocurrencies, as currently offered by many blockchain platforms, are destined to fail because of the anonymity designed into their monetary transactions.

    The first objective, data independence, is clearly lacking in every blockchain data management framework. The data structures used by the Bitcoin and Ethereum platforms are designed to ensure the integrity of all transactions and to let nodes verify them, through the consensus algorithm, as quickly as possible. For these reasons, Bitcoin uses hash trees known as Merkle trees and Ethereum uses Merkle Patricia tries (see Kamil Jezek (2020)). As a result, from a business user’s perspective, the data structure is too complex and opaque for decision-making purposes. In fairness, the transactional databases of cryptocurrencies were not designed to meet such requirements.
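    To see why such a structure favours integrity over readability, here is a minimal sketch of a Merkle root computation. It is a simplification under stated assumptions: the transactions are hypothetical byte strings, the function names are my own, and Bitcoin's actual serialization and byte-order conventions are ignored. Only the double SHA-256 hashing and the duplication of the last hash on odd-sized levels follow Bitcoin's approach.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash function Bitcoin uses for its Merkle tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash the leaves pairwise, level by level, until one hash remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256d(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical transactions, for illustration only.
txs = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
root = merkle_root(txs)
tampered_root = merkle_root([b"alice->bob:50", b"bob->carol:2", b"carol->dave:1"])

print(root.hex())
print(root != tampered_root)
```

    Changing a single transaction changes the root, which is what makes verification cheap. The root itself, however, tells a business user nothing about the transactions underneath it.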

    As regards the third objective, the current implementation of blockchain technology for permissionless access to transactional data, and the associated implementation of a consensus algorithm, does not even aim at segmenting data by the attributes of those undertaking transactions or of the transactions themselves. So this objective is absent by definition for cryptocurrency-oriented blockchains.

    But business use cases for blockchain need not be confined to the world of cryptocurrencies. If we take the definition of blockchain as a growing list of records, then it should be possible to marry blockchain with a proper relational database and derive the benefits of both technologies: the immutability property of blockchain, and access to enriched transactional data for data analysis. A number of applications have been created to query blockchain data. Some of these applications are listed below.

    1. Bitquery is an OLAP system built to provide business intelligence on data stored in a blockchain. Data in this system is sourced from a blockchain using Graph Query Language and stored in a multidimensional OLAP cube. https://bitquery.io/

    2. BitIodine is a tool proposed by Michele Spagnuolo et al. for analysing and profiling the Bitcoin network. The authors have suggested a methodology to “automatically parse the blockchain, cluster addresses, classify addresses and users, graph, export and visualize elaborated information from the Bitcoin network”. The authors claim that their methodology can identify illegal or criminal use of cryptocurrency, as in the case of the “Silk Road” incident.

    3. Chainalysis is another query tool developed on blockchain data for investigating cryptocurrency transactions. https://www.chainalysis.com/

    4. Nansen is another commercial software application for analyzing blockchain data. It has built a repository of more than 70 million crypto wallets. As with the applications described above, the ability to monitor the flow of funds from one address to another is a key feature of this application. https://www.nansen.ai/about

    5. Abe is another software that reads “the Bitcoin block file, transforms and loads the data into a database, and presents a web interface similar to Bitcoin Block Explorer. Abe runs on PostgreSQL, MySQL’s InnoDB engine, and SQLite. Other SQL databases may work with minor changes.”

    https://github.com/bitcoin-abe/bitcoin-abe#readme

    6. Kondor and his associates at Eötvös Loránd University, Hungary, have analysed Bitcoin data and created a Bitcoin Transaction Network dataset that provides bitcoin transaction data as extracted with the bitcoind client. The data is provided in a tab-separated (TSV) file.

    https://datadryad.org/stash/dataset/doi:10.5061/dryad.qz612jmcf
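    The common idea behind several of these tools is to extract transactions from the chain and load them into a store that can be queried. A minimal sketch follows, assuming a deliberately simplified, hypothetical schema with one sender and one receiver per row. Real Bitcoin transactions have multiple inputs and outputs, and this is not the schema used by Abe or by any of the tools listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE transfer (
        tx_hash    TEXT    NOT NULL,
        block_no   INTEGER NOT NULL,
        sender     TEXT    NOT NULL,
        receiver   TEXT    NOT NULL,
        amount_sat INTEGER NOT NULL   -- amount in satoshi
    )
    """
)

# Hypothetical rows, standing in for data parsed from block files.
rows = [
    ("t1", 100, "addrA", "addrB", 150),
    ("t2", 100, "addrA", "addrC", 50),
    ("t3", 101, "addrB", "addrC", 100),
    ("t4", 102, "addrC", "addrA", 25),
]
conn.executemany("INSERT INTO transfer VALUES (?, ?, ?, ?, ?)", rows)

# Segment the data by an attribute of the transacting party.
received = conn.execute(
    """
    SELECT receiver, SUM(amount_sat) AS total
    FROM transfer
    GROUP BY receiver
    ORDER BY total DESC, receiver
    """
).fetchall()

print(received)
```

    Once the data sits in relational tables, the set-processing objective discussed earlier is met by ordinary SQL, while the blockchain remains the immutable source of record.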

    It is important to note here that analysis of blockchain data involves analysis of graph data. Graph analytics has been used extensively in social network analysis. Such analysis can provide insight into the flow of money or value from one node in a blockchain network to another, and can identify, with a certain probability, the addresses that relate to a particular wallet. The Graph protocol, nicknamed the “Google of blockchains”, was created for indexing and querying data from blockchains, starting with Ethereum. Initially it provided a hosted service for free, but it has now been announced that the hosted service will cease in 2023.
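    As a rough illustration of the graph view, the sketch below builds a directed, weighted graph from a handful of hypothetical transfers and answers two simple questions: which addresses can funds from a given address reach, and what is the net flow of an address. The addresses, amounts, and function names are assumptions made for the example; production tools use far richer heuristics, such as address clustering.

```python
from collections import defaultdict, deque

# Hypothetical transfers: (sender address, receiver address, amount in satoshi).
transfers = [
    ("addrA", "addrB", 150),
    ("addrA", "addrC", 50),
    ("addrB", "addrC", 100),
    ("addrD", "addrA", 25),
]

# Directed, weighted graph: graph[sender][receiver] = total amount sent.
graph = defaultdict(lambda: defaultdict(int))
for sender, receiver, amount in transfers:
    graph[sender][receiver] += amount


def downstream(start):
    """Addresses that funds leaving `start` can reach (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, {}):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


def net_flow(address):
    """Total received minus total sent for one address."""
    received = sum(edges.get(address, 0) for edges in graph.values())
    sent = sum(graph.get(address, {}).values())
    return received - sent


print(sorted(downstream("addrD")))
print(net_flow("addrA"), net_flow("addrC"))
```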

    Finally, it is quite clear that blockchain should be considered a repository of transactions but not a database proper. In this age of the internet, when 2.5 quintillion (2.5 × 10^18) bytes of data are produced every day, it would be next to impossible to adhere to Codd’s objectives while storing even petabytes of data (1 petabyte is 1 million GB) in a proper database. For example, Hadoop, which has been designed to handle Big Data, is a framework that allows files of structured as well as unstructured data stored on multiple computers to be accessed, retrieved, and analyzed. So blockchain has its own uses, but an enterprise cannot, or rather should not, store all its business data in a blockchain. For example, Walmart Canada has successfully built a private blockchain on Hyperledger Fabric to solve supply-chain challenges, but that blockchain rests on top of a legacy system.

    References:

    Dinh Tien Tuan Anh et al. (2017), Untangling Blockchain: A Data Processing View of Blockchain Systems, IEEE, DOI 10.1109/TKDE.2017.2781227

    Jules Azad Emery, Matthieu Latapy (2021), Full Bitcoin Blockchain Data Made Easy, IEEE/ACM International Conference on Advances in Social Network Analysis and Mining (ASONAM 2021), Nov 2021, The Hague (virtual), Netherlands. hal-03443053

    Kamil Jezek (2020), Ethereum Data Structures (August 2020), https://doi.org/10.1145/1122445.1122456

    Kate Vitasek, John Bayliss, Loudon Owen, and Neeraj Srivastava (2022), How Walmart Canada Uses Blockchain to Solve Supply-Chain Challenges, Harvard Business Review, January 2022

    Kondor D, Pósfai M, Csabai I, Vattay G (2014), Do the Rich Get Richer? An Empirical Analysis of the Bitcoin Transaction Network, PLoS ONE 9(2): e86197. doi:10.1371/journal.pone.0086197

    McGinn D, D. McIlwraith and Y. Guo, Toward Open Data Blockchain Analytics: A Bitcoin Perspective, Royal Society Open Science, https://doi.org/10.48550/arXiv.1802.07523

    Spagnuolo, M., Maggi, F., Zanero, S. (2014), BitIodine: Extracting Intelligence from the Bitcoin Network, in: Christin, N., Safavi-Naini, R. (eds) Financial Cryptography and Data Security. FC 2014. Lecture Notes in Computer Science, vol 8437.

    Xu Cheng, Ce Zhang, Jianliang Xu (2019), vChain: Enabling Verifiable Boolean Range Queries over Blockchain Databases, 2019 International Conference on Management of Data (SIGMOD ’19), June 30–July 5, 2019

    Yue Kwok-Bun, Karthika Chandrasekar, and Hema Gullapalli (2019), Storing and Querying Bitcoin Blockchain Using SQL Databases, Information Systems Education Journal, Vol 17(4)

  • Block Chain – for my own understanding:

    Part1: Definitional Issues

    I am trying to understand the concept and use cases of blockchain. I plan to put up a series of blogs on this subject as I navigate through the complexity of the subject. I will be more than happy to receive any response pointing out flaws in my understanding. Please write to me at ashok.nag@gmail.com.

    Definition of Blockchain:

    Bitcoin.org: The blockchain is a shared public ledger on which the entire Bitcoin network relies. All confirmed transactions are included in the blockchain. It allows Bitcoin wallets to calculate their spendable balance so that new transactions can be verified thereby ensuring they’re actually owned by the spender. The integrity and the chronological order of the blockchain are enforced with cryptography.

    Ethereum: A blockchain is a public database that is updated and shared across many computers in a network.

    Wikipedia: A growing list of records, called blocks, that are securely linked together using cryptography. The blocks are timestamped and chained with the previous block by incorporating a cryptographic hash of the previous block.

    IBM: Blockchain is a shared, immutable ledger that facilitates the process of recording transactions and tracking assets in a business network.

    Oracle: Blockchain is defined as a ledger of decentralized data that is securely shared. Blockchain technology enables a collective group of select participants to share data. With blockchain cloud services, transactional data from multiple sources can be easily collected, integrated, and shared. Data is broken up into shared blocks that are chained together with unique identifiers in the form of cryptographic hashes.

    Our definition: Blockchain is a digital record management system with the following properties:

    1. Records are grouped into blocks with a pre-defined limit for the size of a block. The size of a block determines the number of records of a given size that a block can include. Data in blocks can only be appended and not deleted or modified.
    2. The process of creating a new block and adding it to a given chain determines the type of blockchain. There are mainly two types of blockchain, namely permissionless and permissioned. The former one is called public blockchain as access to it is open to all. The latter type restricts access to authenticated users only and is also known as a private blockchain. 
    3. Blocks are mostly stored in a key-value database. Bitcoin, Ethereum, and many other cryptocurrencies use Google’s LevelDB database. Cryptographic hashes are used as identifiers for a block as well as for its records. In other words, the hashes are the keys and the data are the values.
    4. The “chain” part in “Blockchain” refers to the fact that two consecutive blocks are linearly linked as a parent and a child. The block which has no parent is called the Genesis block of a particular chain. The “chaining process” entails the incorporation of the hash value of a parent block in the header of the child block. A block’s header contains all the metadata of the block. This linking of parent and child through a hashing process ensures the immutability of the data of a child’s parent block and, in turn, of all its ancestors up to the genesis block.
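    The chaining process and the key-value storage described in the list above can be sketched in a few lines. This is a toy model under my own assumptions: the header fields (parent_hash, records_hash), the function names, and the JSON serialization are all hypothetical, and no consensus mechanism or block-size limit is modelled.

```python
import copy
import hashlib
import json


def digest(obj) -> str:
    """SHA-256 of a JSON-serialisable object, as a hex string."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def build_chain(batches):
    """Turn a list of record batches into a list of linked blocks."""
    chain, parent_hash = [], None          # the genesis block has no parent
    for records in batches:
        header = {"parent_hash": parent_hash, "records_hash": digest(records)}
        chain.append({"header": header, "records": records})
        parent_hash = digest(header)       # goes into the child's header
    return chain


def is_valid(chain) -> bool:
    """Re-derive every hash and compare it with what the headers claim."""
    parent_hash = None
    for block in chain:
        header = block["header"]
        if header["parent_hash"] != parent_hash:
            return False
        if header["records_hash"] != digest(block["records"]):
            return False
        parent_hash = digest(header)
    return True


chain = build_chain([["a pays b 5"], ["b pays c 2"], ["c pays d 1"]])

# Key-value view: the header hash is the key, the block is the value.
store = {digest(block["header"]): block for block in chain}

# Tamper with the genesis block and even "repair" its own header.
tampered = copy.deepcopy(chain)
tampered[0]["records"][0] = "a pays b 500"
tampered[0]["header"]["records_hash"] = digest(tampered[0]["records"])

print(is_valid(chain))     # True
print(is_valid(tampered))  # False: the child still holds the old parent hash
```

    The tampered chain fails validation even though the altered block is internally consistent, because the child's header still carries the hash of the original parent. Rewriting one block therefore requires rewriting every descendant, which is the sense in which the data is immutable.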

    Before explaining the components of a blockchain in more detail, we need to clarify the term “distributed ledger” and its connection, if any, to the concept of “distributed database”.

    A ledger, primarily an accounting term, is a date-wise summary of all transactions of value, the details of which are kept in supporting books called “journals”. The word “ledger” was used in the blockchain context because its first use case was the creation of a decentralized currency system. Since a ledger is also a record-keeping system, the term has persisted. But the question remains whether a “distributed ledger” is conceptually and practically equivalent to a “distributed database”. The answer is a big No.

    Let us first demystify the term “distributed”. Oracle has defined a distributed database as “a set of databases stored on multiple computers that typically appears to applications as a single database. Consequently, an application can simultaneously access and modify the data in several databases in a network”. 

    Özsu and Valduriez have defined a distributed database as a “collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (distributed DBMS) is then defined as the software system that permits the management of the distributed database and makes the distribution transparent to the users” (page 3). It is important to note that a distributed database system must also have an associated database management system to enable end users to access and query the data and to generate user-defined reports. Although a key-value database is also called a database, it provides limited support for manipulating data to discover patterns within the database and, therefore, the DBMS associated with it is very rudimentary.

    Let us now look into the ledger aspect of blockchain-based databases. What does a business ledger look like? IBM, while highlighting the deficiencies of current business ledgers, has given the following example.

    Source:     https://developer.ibm.com/tutorials/cl-blockchain-basics-intro-bluemix-trs/

    IBM states that the current business ledgers are “inefficient, costly, and subject to misuse and tampering.” If this is the “reality”, then all the balance sheets and P&L accounts of IBM itself are faulty and cannot be trusted by the investing public or by any tax authority. Be that as it may, it is undoubtedly true that no enterprise will maintain its transactions only in a blockchain database, although the immutability property of blockchain may have its own uses.

    For pedagogical purposes, let us consider the following example of an accounting database model.

    Source: https://towardsdatascience.com/how-to-build-an-accounting-system-using-sqlite-2ce31f8b8652

    Obviously, a proper industry-standard accounting information system (AIS) will require a much more complex database. For our limited purpose, it suffices to note that a ledger book database cannot be a list of transactions only. A number of complex rules must be enforced on the database to create a proper double-entry accounting system. For example, a bank reconciliation process that matches a company’s bank statement with its cashbook balances is automated in many accounting information systems. The participants in this process must be authorized and cannot be anonymous validators. A blockchain-based database cannot be a solution for such essential requirements of an AIS, although the underlying transactions can be stored in a private blockchain for future auditing requirements (see the article: Blockchain as the Database Engine in the Accounting System).
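    One such rule is that every journal entry must balance, i.e. total debits must equal total credits. The sketch below shows how a relational store can enforce it. The table name, column names, and accounts are hypothetical and far simpler than a real AIS schema; amounts are kept as whole currency units to avoid rounding issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE journal_line (
        entry_id INTEGER NOT NULL,
        account  TEXT    NOT NULL,
        debit    INTEGER NOT NULL DEFAULT 0 CHECK (debit >= 0),
        credit   INTEGER NOT NULL DEFAULT 0 CHECK (credit >= 0),
        -- each line is either a debit or a credit, never both or neither
        CHECK ((debit = 0) <> (credit = 0))
    );
    """
)


def post_entry(entry_id, lines):
    """Insert one journal entry; reject it unless debits equal credits."""
    total_debit = sum(debit for _, debit, _ in lines)
    total_credit = sum(credit for _, _, credit in lines)
    if total_debit != total_credit:
        raise ValueError(f"entry {entry_id} does not balance")
    with conn:  # commit all lines together, or none of them
        conn.executemany(
            "INSERT INTO journal_line VALUES (?, ?, ?, ?)",
            [(entry_id, account, debit, credit) for account, debit, credit in lines],
        )


post_entry(1, [("Cash", 1000, 0), ("Sales", 0, 1000)])
post_entry(2, [("Rent expense", 400, 0), ("Cash", 0, 400)])

trial_balance = conn.execute(
    "SELECT SUM(debit), SUM(credit) FROM journal_line"
).fetchone()
print(trial_balance)
```

    A rule of this kind is applied by an identified, authorized system before data is accepted. That is a different requirement from the append-only guarantee a blockchain provides.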

    Let us consider the information management issues in regard to the Letter of Credit (LC), the most important banking document for facilitating international trade. In general, there are five parties involved in an LC-based international payment settlement process. They are the importer, the exporter, the issuing bank, the advising bank, and the confirming bank. Today, banks use the SWIFT platform’s category 7 message type for sending and receiving messages between these parties. It is eminently possible to use a permissioned blockchain platform for sending and receiving LC-related messages. But it is impossible to use a public blockchain platform for this purpose. Furthermore, such a blockchain must rest on top of a standard relational database to enable payment settlement and recording of the underlying credit flow.

    The moot point is that a blockchain-mediated database is extremely useful for record-keeping purposes but not for enabling contestable transactions involving value between a network of legally connected parties. For enforcement of any contract between two parties, the foremost requirement is the identification of the parties involved. It is immaterial whether the parties are connected in a network managed by a centralized authority or not. The anonymity of transacting parties should be considered the weakest part of a blockchain-based transaction system and not its strongest one. As we know, a chain is only as strong as its weakest part.

    References:

    Musa Aujara Shamsuddeen (April 2019) , Documentary Letter of Credit Discrepancy and Risk Management in the Nigerian: Crude Oil Export; Ph.D Thesis submitted to University of Central Lancashire

    Özsu  M. Tamer & Valduriez   Patrick (2011)  Principles of Distributed Database Systems; 3rd Edition 2011

    Tan Boon Seng , Kin Yew Low (2019) Blockchain as the Database Engine in the Accounting System in  Australian Accounting Review No. 89 Vol. 29 Issue 2

  • Indo-Pacific Economic Framework- A surrogate NATO for South and East Asian Countries?

    To understand the drivers of the Indo-Pacific Economic Framework (IPEF), launched on 23 May 2022 by 13 countries of the Indo-Pacific region, including the 4 members of the QUAD group and most of the ASEAN countries, we need to understand the interplay of the regional and global aspirations of, and the challenges faced by, these countries.

    The first quarter of the present century has seen a quantum leap in humanity’s progress in science and technology, creating the possibility of bringing an end to the childhood of humanity. A possibility but not a certainty. On the contrary, a more than even chance is emerging of a nuclear armageddon bringing an end to human civilization as we know it now. The 9/11 terror attack, the financial crisis of 2007-08, America’s war on terror and its exit from Afghanistan, the disproportionate impact of the COVID-19 pandemic on developed countries and now the Ukraine war: all are pointers to an irreconcilable conflict of interests among the nation states of today, which can be resolved only in a theater of war and destruction. Globally, there are two conflicting, intertwined players: a declining but still globally dominant power, both economically and technologically, and a rising power with the ability to challenge the dominant one on both these fronts.

    The genesis of IPEF can be traced back to a 2018 document, declassified in January 2021, on the Indo-Pacific Strategic Framework prepared by the United States National Security Council (USNSC). The foremost security challenge faced by the USA, as identified by the USNSC, is: “How to maintain US strategic primacy in the Indo-Pacific region and promote a liberal economic order while preventing China from establishing new, illiberal spheres of influence, and cultivating areas of cooperation to promote regional peace and prosperity?”.

    The document emphasizes the threat posed by China’s rise as a technology superpower. “China seeks to dominate cutting-edge technologies, including Artificial Intelligence and Bio-genetics, and harness them in the service of authoritarianism. Chinese dominance in these technologies would pose profound challenges to free societies.”

    This 2018 strategy document also underscores India’s pivotal role in containing as well as counterbalancing China’s aggressive posture in the Indo-Pacific region. The document is quite candid about the USA’s objective in regard to India:

    “Accelerate India’s rise and capacity to serve as a net provider of security and Major Defense Partner; solidify an enduring strategic partnership with India, underpinned by a strong Indian military able to effectively collaborate with the United States”.

    The Indo-Pacific Strategy document issued in February 2022 by the US government espouses the same line of thought articulated by the 2018 document. The word “economic” is added to provide a veneer of creating a trading bloc like the one envisaged in the Trans-Pacific Partnership Agreement (TPP). TPP did not take off as the US Senate failed to ratify it. As it was a trade agreement, ratification by Congress was a necessity. By making IPEF a framework document, a kind of declaration of intent, it should be possible to avoid the requirement of legislative approval in any of the signatory countries. The word “economic” is also slightly problematic since there are already two agreements for facilitating trade among countries of this region. The Comprehensive and Progressive Agreement for Trans-Pacific Partnership (CPTPP) is a free trade agreement (FTA) among 11 countries including Canada, Chile, Mexico, and Peru. The CPTPP was concluded on 23 January 2018 in Tokyo, Japan, and signed on 8 March 2018 in Santiago, Chile. The Regional Comprehensive Economic Partnership Agreement (RCEP) is another free trade agreement between the ASEAN countries and Australia, China, Japan, Korea, and New Zealand. India was a member of the drafting committee of RCEP but eventually did not join it because it would put India in a disadvantageous situation vis-à-vis China in a free trade regime. It is interesting to note that 3 ASEAN countries having close relationships with China, namely Cambodia, Laos and Myanmar, have kept themselves away from IPEF.

    Four areas of cooperation have been identified in the joint statement issued by the 13 signatory countries to IPEF. In each of them, it is difficult to see a convergence of interest of all signatory countries. For example, let us consider the Clean Energy, Decarbonization, and Infrastructure component of IPEF. Although India is a signatory to the Paris agreement that requires all countries to achieve net-zero carbon emission by 2050, the Indian prime minister has promised to cut its emissions to net-zero only by 2070. China has committed to reaching net-zero status by 2060 while the US and the EU have committed to reaching the target by 2050. India’s overriding national interest of poverty eradication by maintaining its growth momentum over a longer time will not allow it to align its decarbonization policies with those of developed countries, which already enjoy a lifestyle that has led to a much higher per capita carbon emission than is the case with India.

    As regards the Trade component of the framework, the declarative statements are as general as possible. Out of the 13 participating countries in IPEF, only the USA and India are not part of another regional free trade agreement, the Regional Comprehensive Economic Partnership or RCEP. China is a member of the RCEP trade bloc. India was a member of the RCEP drafting committee from the time the committee began its work in 2011, and it opted out in November 2019, just before the signing date of the agreement. As a result, India would be out of the two existing trade blocs that cover almost all important countries of the region, RCEP and CPTPP. So it is difficult to envisage what new terms and conditions IPEF can bring in to assuage India’s concerns.

    As regards the Supply Chain component of IPEF, the statement says: “ensure access to key raw and processed materials, semiconductors, critical minerals, and clean energy technology”. Among manufactured products, only “semiconductors” is mentioned. The most important omission is Artificial Intelligence-related products, which represent the cutting-edge technologies of today.

    To conclude, on the high table of the 13 signatory countries of IPEF, the USA is bringing nothing substantial to offer. It is more of a taker than a giver. IPEF may turn out to be more of a hubris of a declining power.

  • Bear Hug

    A daughter is crying on her cell

    Oh my dearest mom

    Believe me, believe me

    The bombs are falling all around.

    The mother from Muscovy laughs aloud

    Are you awake, my sweetheart

    A hallucination, a nightmare no doubt.

    My bear is a polar one, eager to hug you all

    Proselytization is not Jesus’s call.

    My dear child

    Nothing to fear

    The winner will not take it all.

    The daughter cries out

    Mom, my dearest mom

    I love you

    I love you most.

    When no more call reaches to you

    Believe then, believe then

    That your Bear has taken me out.

    Ferocity, deception, and sheer arrogance

    Will prevail

    The winner will take it all.

    @apology to Abba for the line “winner takes it all”

  • Central Bank Digital Currency

    I am providing a link below to the latest version of my paper. The Reserve Bank of India has declared that it will start a pilot project on the issuance of CBDC. The former Governor Subbarao has strongly cautioned RBI against any interest payment on account-based CBDC. Please see my detailed discussion on various issues related to this subject.

    The key takeaways from my paper:

    1. CBDC should not be a mutated version of Bitcoin type digital coin.
    2. CBDC must possess three properties of paper currency fully and comprehensively: No third party verification is required to transfer digital currency from a holder to a recipient.
    3. No account balance concept is introduced and therefore no double spending is possible.
    4. A holder is a legal owner unless proved otherwise.
    5. All digital currency notes are of a certain denomination and every transfer is legitimate as long as the wallets are genuine. A proper application of public key cryptography and hash functions allows a digital currency to mimic its paper-based counterpart.
    6. The only difference from paper currency is that transactions based on digital currency are not completely anonymous. But investigation of the audit trail of a particular digital note would be very complex and costly. So it would not be easy.
    7. Double spending is prevented because notes are automatically modified in the wallet of the sender which will not be accepted by another receiver’s wallet. No internet is required for a transaction to take place and notes cannot be sent through internet.
    8. No requirement of a blockchain database.
    9. It is neither an account-based nor a token based payment system.
    10. Notes can travel back to issuer- the central bank- and get destroyed by the central bank.

    https://docs.google.com/document/d/1b9L8OGBUy7rVjvMVdFmnvP_9q1uNAG1i/edit?usp=sharing&ouid=109936802430456407164&rtpof=true&sd=true

  • COVID-19- A cross country analysis

    Introduction:

    The death toll of COVID-19 had reached 2.9 million by April 2021, a little less than 0.04% of the world population. In Wikipedia’s list of the largest known epidemics and pandemics caused by an infectious disease, COVID-19 is ranked 8th in terms of its death toll. The deadliest known pandemic in history, the Black Death of 1346-1353, in comparison killed between 70 and 200 million people. Thus, humanity has been able to contain, if not eradicate, nature’s fury by constant progress in scientific knowledge and technology. And, to paraphrase Shakespeare, “therein lies the rub”. The incidence of death due to COVID-19 has been the largest in the most advanced country of the world, that is the USA. Till January 2021, the USA accounted for around 20% of total recorded deaths worldwide due to COVID-19. The top 5 countries, namely the USA, Brazil, India, Mexico, and the UK, accounted for a little less than 49% of total deaths. The share of these 5 countries in the world population was around 27% and, excluding India, the other 4 countries had only 9.5% of the world’s population. This huge disparity among various countries in terms of the mortality impact of COVID-19 calls for a cross-country analysis of the same.

    The objective of the present paper is to identify the distinctive characteristics of the countries recording first-wave COVID-19 deaths of varying intensities. Since the country is our unit of analysis, data on various proximate causes of death of a COVID-19 infected person may not be available at that level. However, available micro-level studies of patients of a single hospital or of a local administrative unit, like a county, can be relied upon to identify the possible factors, like the presence of certain specific co-morbidities, that could determine the fatality rate of COVID-19 patients.

    The paper is organized into 3 main sections. Section I reviews the literature on the characteristics of COVID-19 patients and their impact on the patients’ subsequent survival. The parameters that have been used in creating a scoring system to determine the survival probability of COVID-19 patients are also reviewed. It is an accepted fact that higher mortality is expected for COVID-19 patients with chronic lung diseases like asthma. In this respect, the relevance of the so-called “hygiene hypothesis” is briefly discussed. Section II discusses the data and methodology used. Section III presents the results. A concluding section follows.

    The paper can be downloaded from the link below:

    https://drive.google.com/file/d/1vptzjUt_yNVM43zMpAprZ7g0ifI7WoT3/view?usp=sharing