API
thutil
The package for general utilities.
Developed and maintained by C.Thang Nguyen
Modules:
Attributes:
__description__ = 'Python package'
module-attribute
__long_description__ = 'ML based applications '
module-attribute
__author__ = 'thangckt'
module-attribute
__version
Attributes:
- TYPE_CHECKING
- VERSION_TUPLE
- version (str)
- __version__ (str)
- __version_tuple__ (VERSION_TUPLE)
- version_tuple (VERSION_TUPLE)
TYPE_CHECKING = False
module-attribute
VERSION_TUPLE = Tuple[Union[int, str], ...]
module-attribute
version: str = '0.1.dev150+gb76f61a.d20241231'
module-attribute
__version__: str = '0.1.dev150+gb76f61a.d20241231'
module-attribute
__version_tuple__: VERSION_TUPLE = (0, 1, 'dev150', 'gb76f61a.d20241231')
module-attribute
version_tuple: VERSION_TUPLE = (0, 1, 'dev150', 'gb76f61a.d20241231')
module-attribute
config
Functions:
- validate_config – Validate the config file with the schema file.
- load_setting_file – Load data from a JSON or YAML file.
- load_jsonc – Load data from a JSON file that allows comments.
- unpack_dict – Unpack one level of nested dictionary.
- write_yaml – Write data to a YAML file.
- read_yaml – Read data from a YAML file.
validate_config(config_dict=None, config_file=None, schema_dict=None, schema_file=None, allow_unknown=False, require_all=False)
Validate the config file with the schema file.
Parameters:
- config_dict (dict, default: None) – Config dictionary. Defaults to None.
- config_file (str, default: None) – Path to the YAML config file; overrides config_dict. Defaults to None.
- schema_dict (dict, default: None) – Schema dictionary. Defaults to None.
- schema_file (str, default: None) – Path to the YAML schema file; overrides schema_dict. Defaults to None.
- allow_unknown (bool, default: False) – Whether to allow unknown fields in the config file. Defaults to False.
- require_all (bool, default: False) – Whether to require all fields in the schema file to be present in the config file. Defaults to False.
Raises:
- ValueError – If the config file does not match the schema.
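How allow_unknown and require_all interact can be illustrated with a minimal, self-contained sketch. This is not thutil's implementation: the helper name sketch_validate and the simplified schema format (field name mapped to a Python type) are assumptions made purely for illustration.

```python
def sketch_validate(config: dict, schema: dict,
                    allow_unknown: bool = False,
                    require_all: bool = False) -> None:
    """Hypothetical stand-in for a schema check; raises ValueError on mismatch."""
    errors = []
    for key, value in config.items():
        if key not in schema:
            # Fields absent from the schema are rejected unless explicitly allowed.
            if not allow_unknown:
                errors.append(f"unknown field: {key}")
        elif not isinstance(value, schema[key]):
            errors.append(f"{key}: expected {schema[key].__name__}")
    if require_all:
        # Every schema field must be present in the config.
        for key in schema:
            if key not in config:
                errors.append(f"missing field: {key}")
    if errors:
        raise ValueError("; ".join(errors))


schema = {"name": str, "steps": int}
sketch_validate({"name": "run1"}, schema)  # passes: missing fields are tolerated by default
try:
    sketch_validate({"name": "run1", "extra": 1}, schema)
except ValueError as err:
    print(err)  # unknown field: extra
```

With both flags left at False, the check is strict about extra fields but lenient about missing ones, which matches the defaults documented above.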
load_setting_file(filename: Union[str, Path]) -> dict
Load data from a JSON or YAML file.
load_jsonc(filename: str) -> dict
Load data from a JSON file that allows comments.
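One way to read JSON with comments using only the standard library is to strip the comments first and then hand the result to json.loads. The sketch below assumes `//` line comments and `/* ... */` block comments; the helper name strip_json_comments is hypothetical, and how load_jsonc handles comments internally is not stated here.

```python
import json


def strip_json_comments(text: str) -> str:
    """Remove // line comments and /* block */ comments that sit outside strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Keep the escaped character verbatim (for example \").
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            while i < n and text[i] != "\n":
                i += 1
            continue
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


raw = '''
{
  // a line comment
  "url": "http://example.com",  /* a block comment */
  "n": 3
}
'''
data = json.loads(strip_json_comments(raw))
print(data)  # {'url': 'http://example.com', 'n': 3}
```

Tracking whether the scanner is inside a string is what keeps the `//` in the URL from being mistaken for a comment.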
unpack_dict(nested_dict: dict) -> dict
Unpack one level of nested dictionary.
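The description leaves the exact behavior open. One plausible reading, sketched below as an assumption and not as thutil's actual behavior, is that the items of each dict-valued entry are lifted up one level while deeper nesting is left untouched. The name unpack_one_level is hypothetical.

```python
def unpack_one_level(nested: dict) -> dict:
    """Hypothetical sketch: lift the items of each dict-valued entry one level up."""
    flat = {}
    for key, value in nested.items():
        if isinstance(value, dict):
            flat.update(value)  # the child keys replace the parent key
        else:
            flat[key] = value
    return flat


print(unpack_one_level({"a": 1, "b": {"c": 2, "d": {"e": 3}}}))
# {'a': 1, 'c': 2, 'd': {'e': 3}}
```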
write_yaml(jdata: dict, filename: Union[str, Path])
Write data to a YAML file.
read_yaml(filename: Union[str, Path]) -> dict
Read data from a YAML file.
io
Functions:
- combine_text_files – Combine text files into a single file in a memory-efficient way. Reads and writes in chunks to avoid loading large files into memory.
- download_rawtext – Download raw text from a URL.
combine_text_files(files: list[str], output_file: str, chunk_size: int = 1024)
Combine text files into a single file in a memory-efficient way. Reads and writes in chunks to avoid loading large files into memory.
Parameters:
- files (list[str]) – List of file paths to combine.
- output_file (str) – Path to the output file.
- chunk_size (int, default: 1024) – Size of each chunk in KB to read/write. Defaults to 1024 KB.
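The chunked approach described above can be sketched with plain file objects. This is an illustrative sketch, not the library's code: the name combine_in_chunks is hypothetical, and reading in binary mode is an assumption made so that chunk boundaries never split a multi-byte character.

```python
import os
import tempfile


def combine_in_chunks(files: list[str], output_file: str, chunk_size: int = 1024) -> None:
    """Hypothetical sketch: append each input to output_file, chunk_size KB at a time."""
    chunk_bytes = chunk_size * 1024
    with open(output_file, "wb") as out:
        for path in files:
            with open(path, "rb") as src:
                # read() returns b"" at end of file, which ends the loop.
                while chunk := src.read(chunk_bytes):
                    out.write(chunk)


# Build two small input files in a temporary directory.
tmp = tempfile.mkdtemp()
parts = []
for i, text in enumerate(["first\n", "second\n"]):
    part = os.path.join(tmp, f"part{i}.txt")
    with open(part, "w") as fh:
        fh.write(text)
    parts.append(part)

combined = os.path.join(tmp, "combined.txt")
combine_in_chunks(parts, combined, chunk_size=1)
with open(combined) as fh:
    print(fh.read(), end="")
```

At no point does the sketch hold more than one chunk in memory, regardless of how large the inputs are.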
download_rawtext(url: str, outfile: str = None) -> str
Download raw text from a URL.
path
Functions:
- make_dir – Create a directory with a backup option.
- make_dir_ask_backup – Make a directory and ask for backup if the directory already exists.
- ask_yes_no – Asks a yes/no/backup question and returns the response.
- list_paths – List all files/folders in given directories and their subdirectories that match the given patterns.
- collect_files – Collect files from a list of paths (files/folders). Will search files in folders and their subdirectories.
- change_pathname – Change path names.
- remove_files – Remove files from a given list of file paths.
- remove_dirs – Remove a list of directories.
- remove_files_in_paths – Remove files in the files list in the paths list.
- remove_dirs_in_paths – Remove directories in the dirs list in the paths list.
- copy_file – Copy a file/folder from the source path to the destination path.
- move_file – Move a file/folder from the source path to the destination path.
- scan_dirs – Check whether folders contain certain files and do not contain others.
make_dir(path: str, backup: bool = True)
Create a directory with a backup option.
make_dir_ask_backup(dir_path: str)
Make a directory and ask for backup if the directory already exists.
ask_yes_no(question: str) -> str
Asks a yes/no/backup question and returns the response.
list_paths(paths: list[str], patterns: list[str], recursive=True) -> list[str]
List all files/folders in given directories and their subdirectories that match the given patterns.
Parameters:
- paths (list[str]) – The list of paths to search files/folders.
- patterns (list[str]) – The list of patterns to apply to the files. Each filter can be a file extension or a pattern.
Returns:
- list[str] – A list of matching paths.
Example:
folders = ["path1", "path2", "path3"]
patterns = ["*.ext1", "*.ext2", "something*.ext3", "*folder/"]
files = list_paths(folders, patterns)
Note:
- glob() does not list hidden files by default. To include hidden files, use glob(".*", recursive=True).
- When using recursive=True, the pattern must include ** to search subdirectories.
  - glob("*", recursive=True) will search all FILES & FOLDERS in the CURRENT directory.
  - glob("*/", recursive=True) will search all FOLDERS in the CURRENT directory.
  - glob("**", recursive=True) will search all FILES & FOLDERS in the CURRENT directory & SUBDIRECTORIES.
  - glob("**/", recursive=True) will search all FOLDERS in the CURRENT directory & SUBDIRECTORIES.
- "**/*" is equivalent to "**".
- "**/*/" is equivalent to "**/".
- IMPORTANT: "**/**" will replicate the behavior of "**", then give unexpected results.
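The effect of including or omitting `**` under recursive=True can be checked directly with the standard glob module. The snippet below is a small demonstration on a throwaway directory tree; the helper rel_glob and the file names are made up for the example and are not part of thutil.

```python
import glob
import os
import tempfile

# Build a small throwaway tree: a.txt, sub/b.txt, sub/deep/c.txt
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub", "deep"))
for rel in ("a.txt", os.path.join("sub", "b.txt"), os.path.join("sub", "deep", "c.txt")):
    with open(os.path.join(root, rel), "w") as fh:
        fh.write("x")


def rel_glob(pattern: str) -> list[str]:
    """Run glob under root and return sorted paths relative to root."""
    hits = glob.glob(os.path.join(root, pattern), recursive=True)
    return sorted(os.path.relpath(h, root) for h in hits)


top_level = rel_glob("*")       # files and folders directly under root
top_txt = rel_glob("*.txt")     # no "**", so subdirectories are not searched
all_txt = rel_glob("**/*.txt")  # "**" makes the search descend into subdirectories
print(top_level)
print(top_txt)
print(all_txt)
```

Even with recursive=True, the `*.txt` pattern only sees the top-level file; only the pattern containing `**` reaches sub/b.txt and sub/deep/c.txt.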
collect_files(paths: list[str], patterns: list[str]) -> list[str]
Collect files from a list of paths (files/folders). Will search files in folders and their subdirectories.
Parameters:
- paths (list[str]) – The list of paths to collect files from.
- patterns (list[str]) – The list of patterns to apply to the files. Each filter can be a file extension or a pattern.
Returns:
- list[str] – A list of paths of the matching files.
change_pathname(paths: list[str], old_string: str, new_string: str, replace: bool = False) -> None
Change path names.
Parameters:
- paths (list[str]) – Paths to the files/dirs.
- old_string (str) – Old string in the path name.
- new_string (str) – New string in the path name.
- replace (bool, default: False) – Replace the old path name if the new one exists. Defaults to False.
remove_files(files: list[str]) -> None
Remove files from a given list of file paths.
Parameters:
- files (list[str]) – List of file paths.
remove_dirs(dirs: list[str]) -> None
Remove a list of directories.
Parameters:
- dirs (list[str]) – List of directories to remove.
remove_files_in_paths(files: list, paths: list) -> None
Remove files in the files list in the paths list.
remove_dirs_in_paths(dirs: list, paths: list) -> None
Remove directories in the dirs list in the paths list.
copy_file(src_path: str, dest_path: str)
Copy a file/folder from the source path to the destination path.
move_file(src_path: str, dest_path: str)
Move a file/folder from the source path to the destination path.
scan_dirs(dirs: list[str], with_files: list[str], without_files: list[str] = []) -> list[str]
Check whether folders contain certain files and do not contain others.
Parameters:
- dirs (list[str]) – The paths of dirs to scan.
- with_files (list[str]) – The files that should exist in the path.
- without_files (list[str], default: []) – The files that should not exist in the path. Defaults to [].
Returns:
- list[str] – The paths that meet the conditions.
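The selection rule described above (every with_files entry present, no without_files entry present) can be sketched as follows. This is an illustration under assumptions: the name scan_dirs_sketch is hypothetical, and entries are treated as exact file names, whereas the real function may also accept patterns.

```python
import os
import tempfile


def scan_dirs_sketch(dirs, with_files, without_files=()):
    """Hypothetical sketch: keep dirs holding every with_files name and no without_files name."""
    selected = []
    for d in dirs:
        has_all = all(os.path.exists(os.path.join(d, f)) for f in with_files)
        has_none = not any(os.path.exists(os.path.join(d, f)) for f in without_files)
        if has_all and has_none:
            selected.append(d)
    return selected


# Two folders with an input file; only one of them carries a "finished" marker.
base = tempfile.mkdtemp()
done, pending = os.path.join(base, "done"), os.path.join(base, "pending")
for d in (done, pending):
    os.makedirs(d)
    open(os.path.join(d, "input.yaml"), "w").close()
open(os.path.join(done, "finished.flag"), "w").close()

todo = scan_dirs_sketch([done, pending], ["input.yaml"], ["finished.flag"])
print([os.path.basename(d) for d in todo])  # ['pending']
```

A typical use of this kind of filter is finding work directories that have their inputs but have not yet produced a completion marker.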
pkg
Functions:
- create_logger – Create and configure a logger with console and optional file handlers.
- check_package – Check if the required packages are installed.
- get_func_args – Get the arguments of a function.
- dependency_info – Get the dependency information.
create_logger(logger_name: str = None, log_file: str = None, level: str = 'INFO', level_logfile: str = None, format_: str = 'info') -> logging.Logger
Create and configure a logger with console and optional file handlers.
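A logger with a console handler and an optional file handler can be assembled from the standard logging module roughly as follows. This is a sketch of the general pattern, not thutil's code: make_logger and its fixed format string are assumptions, and the format_ presets of create_logger are not modeled.

```python
import logging
import sys


def make_logger(name, log_file=None, level="INFO", level_logfile=None):
    """Hypothetical sketch: console handler plus an optional file handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let the handlers do the filtering
    logger.handlers.clear()         # avoid duplicate handlers on repeated calls
    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    console = logging.StreamHandler(sys.stdout)
    console.setLevel(level)
    console.setFormatter(fmt)
    logger.addHandler(console)

    if log_file is not None:
        file_handler = logging.FileHandler(log_file)
        # Fall back to the console level when no separate file level is given.
        file_handler.setLevel(level_logfile or level)
        file_handler.setFormatter(fmt)
        logger.addHandler(file_handler)
    return logger


log = make_logger("demo", level="INFO")
log.info("shown on the console")
log.debug("filtered out by the INFO console handler")
```

Giving each handler its own level is what allows the log file to be more verbose than the console.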
check_package(package_name: str, git_repo: str = None, auto_install: bool = False, extra_commands: list[str] = None) -> None
Check if the required packages are installed
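Whether a package is importable can be tested without importing it, using importlib from the standard library. The sketch below shows only that check; the helper name is_installed is hypothetical, and the auto_install and extra_commands behavior of check_package is not reproduced.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


print(is_installed("json"))                     # True: part of the standard library
print(is_installed("surely_not_a_real_pkg_x"))  # False unless such a package exists
```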
_install_package(package_name: str, git_repo: str = None) -> None
Install the required package.
Parameters:
- package_name (str) – Package name.
- git_repo (str) – Git path for the package.
get_func_args(func)
Get the arguments of a function
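Argument names of a callable can be read with inspect.signature. The following is a minimal sketch of that technique; the name func_args is hypothetical, and the return type of get_func_args itself is not documented here.

```python
import inspect


def func_args(func) -> list[str]:
    """Return the parameter names of a callable, in declaration order."""
    return list(inspect.signature(func).parameters)


def example(a, b=2, *args, key=None, **kwargs):
    pass


print(func_args(example))  # ['a', 'b', 'args', 'key', 'kwargs']
```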
dependency_info(modules=['numpy', 'polars', 'thutil', 'ase']) -> str
Get the dependency information
sth2sth
Functions:
file2str(file_path: Union[str, Path]) -> str
str2file(text: str, file_path: Union[str, Path]) -> None
file2list(file_path: Union[str, Path]) -> list[str]
list2file(text_list: list, file_path: Union[str, Path]) -> None
float2str(floatnum, decimals=6)
Convert a float number to str. REF: https://stackoverflow.com/questions/2440692/formatting-floats-without-trailing-zeros
Parameters:
- floatnum (float) – Float number.
- decimals (int, default: 6) – Number of decimal places used when formatting the output string.
Returns:
- s (str) – String of the float number.
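A common way to format a float without trailing zeros, in the spirit of the referenced Stack Overflow thread, is to format with a fixed number of decimals and then strip the zeros. The sketch below assumes that approach; float_to_str is a hypothetical name and float2str may differ in detail.

```python
def float_to_str(value: float, decimals: int = 6) -> str:
    """Fixed-point format, then drop trailing zeros and a dangling decimal point."""
    text = f"{value:.{decimals}f}"
    if "." in text:
        # Only strip when there is a fractional part, so "100" is never cut to "1".
        text = text.rstrip("0").rstrip(".")
    return text


print(float_to_str(3.14))       # 3.14
print(float_to_str(2.0))        # 2
print(float_to_str(0.1234567))  # 0.123457
```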
stuff
Functions:
- chunk_list – Yield successive n-sized chunks from input_list.
- fill_text_center – Create a line with centered text.
- fill_text_left – Create a line with left-aligned text.
- fill_text_box – Put the string at the center of | |.
chunk_list(input_list: list, n: int) -> Generator
Yield successive n-sized chunks from input_list.
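Chunking by slicing is the usual way to produce n-sized pieces lazily. A minimal sketch, with the hypothetical name chunks standing in for chunk_list:

```python
from typing import Generator


def chunks(input_list: list, n: int) -> Generator[list, None, None]:
    """Yield consecutive slices of length n; the final slice may be shorter."""
    for start in range(0, len(input_list), n):
        yield input_list[start:start + n]


print(list(chunks([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```

Because the function is a generator, nothing is sliced until the caller iterates over it.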
fill_text_center(input_text='example', fill='-', max_length=60)
Create a line with centered text.
fill_text_left(input_text='example', left_margin=15, fill='-', max_length=60)
Create a line with left-aligned text.
fill_text_box(input_text='', fill=' ', sp='|', max_length=60)
Put the string at the center of | |.
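The centered and boxed lines produced by these helpers can be approximated with str.center. The sketch below is illustrative only: center_line and box_line are hypothetical names, and details such as the spaces placed around the text are assumptions, so the exact output of fill_text_center and fill_text_box may differ.

```python
def center_line(text: str = "example", fill: str = "-", max_length: int = 60) -> str:
    """Hypothetical sketch: pad text with fill on both sides up to max_length."""
    return f" {text} ".center(max_length, fill)


def box_line(text: str = "", fill: str = " ", sp: str = "|", max_length: int = 60) -> str:
    """Hypothetical sketch: center text between two sp border characters."""
    inner = max_length - 2 * len(sp)
    return sp + text.center(inner, fill) + sp


print(center_line("results", max_length=21))  # ------ results ------
print(box_line("ok", max_length=10))          # |   ok   |
```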