pix2tex¶

pix2tex.cli package¶

class pix2tex.cli.LatexOCR(arguments=None)¶

Bases: object

Get a prediction of an image in the easiest way

Initialize a LatexOCR model

Args:: arguments (Union[Namespace, Munch], optional): Special model parameters. Defaults to None.

pix2tex.cli.minmax_size(img: PIL.Image, max_dimensions: Tuple[int, int] | None = None, min_dimensions: Tuple[int, int] | None = None) → PIL.Image¶

Resize or pad an image to fit into given dimensions

Args::
    img (Image): Image to scale up/down.
    max_dimensions (Tuple[int, int], optional): Maximum dimensions. Defaults to None.
    min_dimensions (Tuple[int, int], optional): Minimum dimensions. Defaults to None.
Returns:: Image: Image with correct dimensionality
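
The size logic described above can be sketched without any imaging library. This is a minimal illustration, not the library's implementation: the helper name fit_dimensions is hypothetical, and it assumes that oversized images are scaled down with their aspect ratio preserved while undersized images are padded up to the minimum.

```python
from typing import Optional, Tuple


def fit_dimensions(
    size: Tuple[int, int],
    max_dimensions: Optional[Tuple[int, int]] = None,
    min_dimensions: Optional[Tuple[int, int]] = None,
) -> Tuple[int, int]:
    """Return the (width, height) an image would have after fitting.

    Hypothetical helper: it only computes sizes, it does not touch pixels.
    """
    width, height = size
    if max_dimensions is not None:
        # Largest factor by which the image exceeds the allowed box.
        ratio = max(width / max_dimensions[0], height / max_dimensions[1])
        if ratio > 1:
            # Scale both sides by the same factor to keep the aspect ratio.
            width = max(1, int(width / ratio))
            height = max(1, int(height / ratio))
    if min_dimensions is not None:
        # Assumed behaviour: pad (never stretch) up to the minimum size.
        width = max(width, min_dimensions[0])
        height = max(height, min_dimensions[1])
    return width, height
```

For example, a 1000×200 image with a maximum of (500, 300) would be scaled to 500×100, while a 10×50 image with a minimum of (32, 32) would be padded to 32×50.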

pix2tex.gui package¶

pix2tex.api package¶

Submodules¶

pix2tex.api.app module¶

async pix2tex.api.app.predict(file: UploadFile = File(PydanticUndefined)) → str¶

Predict the Latex code from an image file.

Args:: file (UploadFile, optional): Image to predict. Defaults to File(…).
Returns:: str: Latex prediction

async pix2tex.api.app.predict_from_bytes(file: bytes = File(PydanticUndefined)) → str¶

Predict the Latex code from a byte array

Args:: file (bytes, optional): Image as byte array. Defaults to File(…).
Returns:: str: Latex prediction

pix2tex.api.app.root()¶: Health check.

pix2tex.api.streamlit module¶

pix2tex.dataset package¶

Submodules¶

pix2tex.dataset.arxiv module¶

pix2tex.dataset.arxiv.get_all_arxiv_ids(text)¶: Returns all arXiv IDs present in the string text.
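
As a rough illustration of this kind of extraction, the sketch below finds new-style arXiv identifiers (YYMM.NNNN or YYMM.NNNNN) with a regular expression. The pattern and the function name find_arxiv_ids are assumptions made for this example; the library may use a different pattern and may also handle old-style identifiers.

```python
import re
from typing import List

# Assumed pattern: four digits, a dot, four or five digits, optional version.
ARXIV_ID = re.compile(r'(?<!\d)(\d{4}\.\d{4,5})(?:v\d+)?(?!\d)')


def find_arxiv_ids(text: str) -> List[str]:
    """Return the unique new-style arXiv IDs in text, in order of appearance."""
    # dict.fromkeys removes duplicates while keeping the first-seen order.
    return list(dict.fromkeys(ARXIV_ID.findall(text)))
```

Version suffixes such as v2 are dropped, so the same paper cited in two versions is reported once.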

pix2tex.dataset.arxiv.read_tex_files(file_path: str, demacro: bool = False) → str¶

Read all tex files in the latex source at file_path. If it is not a tar.gz file, try to read it as a text file.

Args::
    file_path (str): Path to latex source.
    demacro (bool, optional): Deprecated. Call external de-macro program. Defaults to False.
Returns:: str: All Latex files concatenated into one string.

pix2tex.dataset.dataset module¶

pix2tex.dataset.demacro module¶

exception pix2tex.dataset.demacro.DemacroError¶: Bases: Exception

pix2tex.dataset.demacro.bracket_replace(string: str) → str¶: Replaces all layered brackets with special symbols.

pix2tex.dataset.demacro.pydemacro(t: str) → str¶

Replaces all occurrences of newly defined Latex commands in a document. Can replace newcommand, def and let definitions in the code.

Args:: t (str): Latex document
Returns:: str: Document without custom commands
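
To make the idea concrete, here is a deliberately simplified sketch that expands only argument-free \newcommand definitions whose body contains no nested braces. The function name expand_simple_newcommands is hypothetical, and the real pydemacro covers far more cases (arguments, def, let, nested braces).

```python
import re

# Matches \newcommand{\name}{body} where body has no braces of its own.
SIMPLE_NEWCOMMAND = re.compile(r'\\newcommand\{(\\[A-Za-z]+)\}\{([^{}]*)\}')


def expand_simple_newcommands(doc: str) -> str:
    """Inline argument-free \\newcommand macros and drop their definitions."""
    macros = dict(SIMPLE_NEWCOMMAND.findall(doc))
    body = SIMPLE_NEWCOMMAND.sub('', doc)
    for name, replacement in macros.items():
        # The lookahead keeps \eps from matching inside \epsilon.
        usage = re.compile(re.escape(name) + r'(?![A-Za-z])')
        # A function replacement avoids backslash escape processing.
        body = usage.sub(lambda match, text=replacement: text, body)
    return body
```

With the definition \newcommand{\eps}{\varepsilon}, the text $\eps > 0$ becomes $\varepsilon > 0$, while $\epsilon$ is left untouched.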

pix2tex.dataset.extract_latex module¶

pix2tex.dataset.extract_latex.find_math(s: str, wiki=False) → List[str]¶

Find all occurrences of math in a Latex-like document.

Args::
    s (str): String to search.
    wiki (bool, optional): Search for displaystyle as it can be found in the wikipedia page source code. Defaults to False.
Returns:: List[str]: List of all found mathematical expressions
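
A minimal sketch of this kind of search is shown below. It assumes only the four common delimiter pairs and ignores escaped dollar signs, math environments and the wiki mode; the pattern and the name find_math_sketch are illustrative, not the library's code.

```python
import re
from typing import List

# Hypothetical pattern; $$...$$ is listed first so it wins over $...$.
MATH_PATTERN = re.compile(
    r'\$\$(.+?)\$\$'     # display math: $$ ... $$
    r'|\\\[(.+?)\\\]'    # display math: \[ ... \]
    r'|\\\((.+?)\\\)'    # inline math: \( ... \)
    r'|\$(.+?)\$',       # inline math: $ ... $
    re.DOTALL,
)


def find_math_sketch(s: str) -> List[str]:
    """Return the contents of all delimited math spans found in s."""
    found = []
    for match in MATH_PATTERN.finditer(s):
        # Exactly one of the four groups is set for each match.
        expr = next(group for group in match.groups() if group is not None)
        found.append(expr.strip())
    return found
```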

pix2tex.dataset.latex2png module¶

pix2tex.dataset.latex2png.extract(text, expression=None)¶

Extract text from text using a regular expression.

Args::
    text (str): input text
    expression (str, optional): regular expression. Defaults to None.
Returns:: str: extracted text

pix2tex.dataset.render module¶

pix2tex.dataset.render.render_dataset(dataset: ndarray, unrendered: ndarray, args) → ndarray¶

Renders a list of tex equations

Args::
    dataset (numpy.ndarray): List of equations.
    unrendered (numpy.ndarray): List of integers of size dataset that give the name of the saved image.
    args (Union[Namespace, Munch]): Additional arguments: mode (equation or inline), out (output directory), divable (common factor), batchsize (how many samples to render at once), dpi, font (math font), preprocess (crop, alpha off), shuffle (bool).
Returns:: numpy.ndarray: equation indices that could not be rendered

pix2tex.dataset.scraping module¶

pix2tex.dataset.scraping.recursive_search(parser: Callable, seeds: List[str], depth: int = 2, skip: List[str] = [], unit: str = 'links', base_url: str | None = None, **kwargs) → Tuple[List[str], List[str]]¶

Find math recursively. Look in seeds for math and for further sites to visit.

Args::
    parser (Callable): A function that returns a Tuple[List[str], List[str]] of math and ids (for base_url) respectively.
    seeds (List[str]): First set of ids.
    depth (int, optional): How many iterations to look for. Defaults to 2.
    skip (List[str], optional): List of already visited ids. Defaults to [].
    unit (str, optional): Tqdm verbose unit description. Defaults to ‘links’.
    base_url (str, optional): Base url to add ids to. Defaults to None.
Returns:: Tuple[List[str],List[str]]: Returns list of found math and visited ids respectively.
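
The parser contract and the depth-limited loop can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the name recursive_search_sketch, the in-memory PAGES table and toy_parser are all hypothetical, and the progress bar, base_url handling and extra keyword arguments are left out.

```python
from typing import Callable, Dict, List, Optional, Tuple

ParseResult = Tuple[List[str], List[str]]  # (math found, ids to visit next)


def recursive_search_sketch(
    parser: Callable[[str], ParseResult],
    seeds: List[str],
    depth: int = 2,
    skip: Optional[List[str]] = None,
) -> ParseResult:
    """Visit seeds, then the ids they yield, for at most depth rounds."""
    visited = list(skip) if skip else []
    math: List[str] = []
    frontier = list(seeds)
    for _ in range(depth):
        next_frontier: List[str] = []
        for page_id in frontier:
            if page_id in visited:
                continue  # never parse the same id twice
            visited.append(page_id)
            found, links = parser(page_id)
            math.extend(found)
            next_frontier.extend(links)
        frontier = next_frontier
        if not frontier:
            break  # nothing left to explore
    return math, visited


# Toy stand-in for web pages: id -> (math on the page, linked ids).
PAGES: Dict[str, ParseResult] = {
    'a': (['x^2'], ['b', 'c']),
    'b': (['y'], ['a']),
    'c': ([], ['d']),
    'd': (['z'], []),
}


def toy_parser(page_id: str) -> ParseResult:
    return PAGES.get(page_id, ([], []))
```

The sketch uses None instead of a mutable [] default for skip, so that visited ids cannot leak between calls.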

pix2tex.dataset.scraping.recursive_stack_exchange(seeds, depth=4, skip=[], base_url='https://math.stackexchange.com/questions/')¶: Recursively search through stack exchange questions

pix2tex.dataset.scraping.recursive_wiki(seeds, depth=4, skip=[], base_url='https://en.wikipedia.org/wiki/')¶: Recursively search Wikipedia for math. Every link on the starting page will be visited in the next round and so on, until there is no math in the child page anymore. This will be repeated depth times.

pix2tex.models package¶

pix2tex.models.hybrid module¶

class pix2tex.models.hybrid.CustomVisionTransformer(img_size=224, patch_size=16, *args, **kwargs)¶

Bases: VisionTransformer

Args::
    img_size (int, tuple): input image size
    patch_size (int, tuple): patch size
    in_chans (int): number of input channels
    num_classes (int): number of classes for classification head
    embed_dim (int): embedding dimension
    depth (int): depth of transformer
    num_heads (int): number of attention heads
    mlp_ratio (int): ratio of mlp hidden dim to embedding dim
    qkv_bias (bool): enable bias for qkv if True
    representation_size (Optional[int]): enable and set representation layer (pre-logits) to this value if set
    distilled (bool): model includes a distillation token and head as in DeiT models
    drop_rate (float): dropout rate
    attn_drop_rate (float): attention dropout rate
    drop_path_rate (float): stochastic depth rate
    embed_layer (nn.Module): patch embedding layer
    norm_layer (nn.Module): normalization layer
    weight_init (str): weight init scheme

pix2tex.models.vit module¶

class pix2tex.models.vit.ViTransformerWrapper(*, max_width, max_height, patch_size, attn_layers, channels=1, num_classes=None, dropout=0.0, emb_dropout=0.0)¶

Bases: Module

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(img, **kwargs)¶

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pix2tex.utils package¶

pix2tex.utils.utils module¶

pix2tex.utils.utils.pad(img: PIL.Image, divable: int = 32) → PIL.Image¶

Pad an image so that each dimension is the next full multiple of divable. Also normalizes the image and inverts it if needed.

Args::
    img (PIL.Image): input image
    divable (int, optional): Value that the padded dimensions must be divisible by. Defaults to 32.
Returns:: PIL.Image
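
The rounding step behind this padding is plain integer arithmetic. The sketch below shows it in isolation; the helper names next_multiple and padded_size are hypothetical, and the normalization and inversion steps are not covered.

```python
from typing import Tuple


def next_multiple(value: int, divable: int = 32) -> int:
    """Round value up to the nearest multiple of divable."""
    quotient, remainder = divmod(value, divable)
    # Add one more block only when value is not already a multiple.
    return divable * (quotient + (1 if remainder > 0 else 0))


def padded_size(size: Tuple[int, int], divable: int = 32) -> Tuple[int, int]:
    """Return the (width, height) of an image after padding each side."""
    width, height = size
    return next_multiple(width, divable), next_multiple(height, divable)
```

With the default of 32, a 100×64 image would be padded to 128×64: the width is rounded up, and the height is already a multiple.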

pix2tex.utils.utils.post_process(s: str)¶

Remove unnecessary whitespace from LaTeX code.

Args:: s (str): Input string
Returns:: str: Processed string
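
As an illustration of this kind of cleanup, the sketch below applies one simplified rule: a space is kept only when it separates two alphanumeric characters, where removing it could merge tokens. Both the rule and the name strip_latex_whitespace are assumptions for this example; the rules used by post_process itself may differ.

```python
import re


def strip_latex_whitespace(s: str) -> str:
    """Remove spaces that are not needed to separate LaTeX tokens."""
    # Collapse every run of whitespace into one space and trim the ends.
    s = ' '.join(s.split())
    # Drop a space unless it sits between two alphanumeric characters,
    # where removing it could merge tokens (e.g. "\alpha x" or "2 3").
    return re.sub(r'(?<![A-Za-z0-9]) | (?![A-Za-z0-9])', '', s)
```

Under this rule, \frac { a } { b } becomes \frac{a}{b}, while the space in \alpha x is preserved.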

pix2tex.utils.utils.seed_everything(seed: int)¶

Seed all RNGs

Args:: seed (int): seed
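
A standard-library-only sketch of the idea follows. The name seed_everything_sketch is hypothetical; a full version would additionally seed any numerical libraries in use (for example NumPy and PyTorch), which is omitted here to keep the example dependency-free.

```python
import os
import random


def seed_everything_sketch(seed: int) -> None:
    """Seed the RNG sources available in the standard library."""
    # Python's built-in pseudo-random generator.
    random.seed(seed)
    # Only affects hash randomization in subprocesses started afterwards.
    os.environ['PYTHONHASHSEED'] = str(seed)
    # Third-party generators (NumPy, PyTorch, ...) would be seeded here.
```

Calling the function twice with the same seed makes subsequent random.random() calls return the same sequence.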
