Published Crawled Data
Starchart publishes all crawled data. This document explains the format(s) and the directory structure of the published data.
Directory Structure
(lab)➜ starchart tree data
data
└── git.batsense.net
    ├── instance.yml
    └── realaravinth
        ├── analysis-of-captcha-systems
        │   └── publiccode.yml
        └── user.yml
Snippet of data crawled from git.batsense.net
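The layout in the tree above can be expressed as a few path helpers. This is a minimal sketch, not starchart's own code: the function names are hypothetical, and the file names (instance.yml, user.yml, publiccode.yml) are taken from the tree.

```python
from pathlib import PurePosixPath


def instance_dir(root: str, hostname: str) -> PurePosixPath:
    """Directory holding everything crawled from one forge instance."""
    return PurePosixPath(root) / hostname


def instance_file(root: str, hostname: str) -> PurePosixPath:
    """Path of the instance.yml that describes the instance."""
    return instance_dir(root, hostname) / "instance.yml"


def user_dir(root: str, hostname: str, username: str) -> PurePosixPath:
    """Subdirectory of a user inside their instance's directory."""
    return instance_dir(root, hostname) / username


def user_file(root: str, hostname: str, username: str) -> PurePosixPath:
    """Path of the user.yml that describes the user."""
    return user_dir(root, hostname, username) / "user.yml"


def repo_file(root: str, hostname: str, username: str, repo: str) -> PurePosixPath:
    """Path of the publiccode.yml stored in the repository subdirectory."""
    return user_dir(root, hostname, username) / repo / "publiccode.yml"


if __name__ == "__main__":
    print(repo_file("data", "git.batsense.net", "realaravinth",
                    "analysis-of-captcha-systems"))
```

Pure paths are used so the helpers only compute locations and never touch the file system.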
Forge
Each forge instance gets its own directory in the repository root path specified in the configuration. All data crawled from an instance will be stored in the instance's directory only.
Each forge instance directory contains an instance.yml file that describes the instance. The schema of instance.yml might change as starchart is currently under development.
---
hostname: git.batsense.net
forge_type: gitea
example instance.yml
User
Each user on a forge instance gets their own subdirectory under the instance's directory, along with a user.yml that describes them. Information on all their repositories is stored under this subdirectory.
Like instance.yml, the user.yml schema is not finalized yet.
---
hostname: git.batsense.net
username: realaravinth
html_link: "https://git.batsense.net/realaravinth"
profile_photo: "https://git.batsense.net/avatars/bc11e95d9356ac4bdc035964be00ff0d"
example user.yml
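Both instance.yml and user.yml, as shown in the examples, are flat mappings of scalar values. The sketch below reads such a file without a YAML library. The function `parse_flat_yaml` is hypothetical and handles only top-level `key: value` lines, so a real YAML parser is the safer choice for anything beyond these examples, especially since the schemas may still change.

```python
def parse_flat_yaml(text: str) -> dict[str, str]:
    """Parse a document made only of top-level `key: value` scalars."""
    fields: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, the document marker, and comments.
        if not line or line == "---" or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key: value line: {raw!r}")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        fields[key.strip()] = value
    return fields


USER_YML = """---
hostname: git.batsense.net
username: realaravinth
html_link: "https://git.batsense.net/realaravinth"
profile_photo: "https://git.batsense.net/avatars/bc11e95d9356ac4bdc035964be00ff0d"
"""

if __name__ == "__main__":
    user = parse_flat_yaml(USER_YML)
    print(user["username"], user["html_link"])
```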
Repository
Repository information is stored under the owner's subdirectory. Currently, partial support for publiccodeyml is implemented, so all repository information is stored in a publiccode.yml file under the repository subdirectory.
---
publiccodeYmlVersion: "0.2"
name: git.batsense.net
url: "https://git.batsense.net/realaravinth/git.batsense.net"
description:
  en:
    shortDescription: "Instance administration logs and discussions pertaining to this Gitea instance. Have a question about git.batsense.net? Please create an issue on this repository! :)"
example publiccode.yml implemented by starchart
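A consumer of the published data could enumerate repositories by walking the three directory levels described in this document. This is a minimal sketch under that assumption. The function `discover` is hypothetical, and it treats a directory as an instance, user, or repository only if the matching instance.yml, user.yml, or publiccode.yml is present.

```python
from pathlib import Path


def discover(root) -> list[tuple[str, str, str]]:
    """Return (hostname, username, repository) for every published repository."""
    root = Path(root)
    found: list[tuple[str, str, str]] = []
    if not root.is_dir():
        return found

    def subdirs_with(parent: Path, marker: str) -> list[Path]:
        # Keep only subdirectories that contain the expected marker file.
        return sorted(
            p for p in parent.iterdir()
            if p.is_dir() and (p / marker).is_file()
        )

    for instance in subdirs_with(root, "instance.yml"):
        for user in subdirs_with(instance, "user.yml"):
            for repo in subdirs_with(user, "publiccode.yml"):
                found.append((instance.name, user.name, repo.name))
    return found


if __name__ == "__main__":
    for hostname, username, repo in discover("data"):
        print(f"{hostname}/{username}/{repo}")
```

Checking for the marker files rather than trusting directory names keeps the walk from misreading unrelated directories that might sit alongside the crawled data.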
See forgeflux-org/starchart#3 and publiccodeyml/publiccodeyml/discussions for more information.